Genomic nucleic acids, cDNA and mRNA which code for polypeptides with IL-16 activity, processes for the production thereof and their use

ABSTRACT

A nucleic acid with which expression of a polypeptide having interleukin-16 activity can be achieved or regulated in a prokaryotic or eukaryotic host cell, wherein in the region coding for the polypeptide, the nucleic acid (a) corresponds to the DNA sequence SEQ ID NOS:1 and 5-7 or its complementary strand; (b) hybridizes under stringent conditions with the DNA of sequence SEQ ID NOS:1 and 5-7; (c) or is a nucleic acid sequence which, if there was no degeneracy of the genetic code, would hybridize under stringent conditions with the nucleic acid sequences defined by (a) and (b); (d) and, if it codes for a polypeptide having interleukin-16 activity, is suitable as genomic DNA for the recombinant production of IL-16 in eukaryotic host cells and processes for producing the same and pharmaceutical compositions containing the same.

The invention concerns regulatory elements of the expression of IL-16, genomic nucleic acids, cDNA and mRNA that code for polypeptides with IL-16 activity, processes for the production thereof and their use.

IL-16 (interleukin-16) is a lymphokine that is also denoted “lymphocyte chemoattracting factor” (LCF) or “immunodeficiency virus suppressing lymphokine” (ISL). IL-16 and its properties are described in WO 94/28134, the International Application PCT/EP96/01486 as well as by Cruikshank, W. W., et al., Proc. Natl. Acad. Sci. U.S.A. 91 (1994) 5109-5113 and Baier, M., et al., Nature 378 (1995) 563. The recombinant production of IL-16 is also described therein. According to this, IL-16 is a protein with a molecular mass of 13,385 D. Cruikshank also found that ISL elutes as a multimeric form with a molecular weight of 50-60 or 55-60 kD in molecular sieve chromatography. The chemoattractant activity is attributed to this multimeric form, which is a cationic homotetramer (product information AMS Biotechnology Ltd., Europe, Cat. No. 11177186). Baier describes a homodimeric form of IL-16 with a molecular weight of 28 kD. However, the chemoattractant activity described by Cruikshank et al., in J. Immunol. 146 (1991) 2928-2934 and the activity of recombinant human IL-16 described by Baier are very low.

The object of the present invention is to provide regulatory elements of IL-16 expression, to improve the activity of IL-16 and to provide forms of IL-16 which exhibit low immunogenicity and are advantageously suitable for therapeutic use.

The object of the invention is achieved by a nucleic acid with which expression of a polypeptide having interleukin-16 activity can be achieved or regulated in a eukaryotic host cell, wherein the said nucleic acid

a) corresponds to the DNA sequence SEQ ID NO:1 or its complementary strand;

b) hybridizes under stringent conditions with the DNA of sequence SEQ ID NO:1, preferably with nucleotides 1-6297 of SEQ ID NO:1;

c) or is a nucleic acid sequence which, if there was no degeneracy of the genetic code, would hybridize under stringent conditions with the nucleic acid sequences defined by a) or b),

d) and, if it codes for a polypeptide having IL-16 activity, has a length of at least 1179 coding nucleotides.

A preferred sequence is the cDNA sequence shown in SEQ ID NO:6, the complementary strand thereof or a sequence which under stringent conditions hybridizes with the sequence SEQ ID NO:6. SEQ ID NO:5 and the plasmid pCI/IL16 PROM also describe the genomic DNA of IL-16 and contain the introns and exons, each partly or completely.

Such a nucleic acid preferably codes for a polypeptide with IL-16 activity, particularly preferably the natural IL-16 of primates such as human IL-16 or IL-16 of a species of monkey or another mammal such as e.g. mouse.

It surprisingly turned out that FIG. 2 of WO 94/28134 does not describe the complete sequence of IL-16. The start codon “ATG” of the precursor form of the protein does not begin with nucleotide 783. The sequence also differs from FIG. 2 of WO 94/28134 in further respects, for example in nucleotide substitutions (313 G by A, 717 C by A, 1070 G by T).

The sequence of IL-16 can differ to a certain extent from protein sequences coded by such DNA sequences. Such sequence variations can be amino acid substitutions, deletions or additions. However, the amino acid sequence of IL-16 is preferably at least 75%, especially preferably at least 90%, identical to the amino acid sequence of IL-16. Variants of parts of the amino acid sequence or nucleic acid sequence are for example described in the International Patent Application Nos. PCT/EP96/01486, PCT/EP96/05662 and PCT/EP96/05661.

Nucleic acids within the sense of the invention are for example understood as DNA, RNA and nucleic acid derivatives and analogues. Preferred nucleic acid analogues are those compounds in which the sugar phosphate backbone is replaced by other units such as e.g. amino acids. Such compounds are denoted PNA and are described in WO 92/20702. Since, for example, PNA-DNA bonds are stronger than DNA-DNA bonds, the stringent conditions for hybridization described in the following are not applicable to PNA-DNA hybridization. Suitable hybridization conditions are, however, described in WO 92/20703.

SEQ ID NO:6 describes the cDNA derived from the mRNA. The cDNA is suitable, for instance, for the determination of the corresponding RNA in tissue fluids and body fluids of mammals and humans. The cDNA is preferably used, however, for the expression of full length IL-16 in prokaryotes, preferably in E. coli. For that purpose the cDNA is inserted into an appropriate vector, transformed into a prokaryotic host cell, said host cell is cultivated, and, after cultivation, IL-16 is isolated. This can be done according to the methods known to one skilled in the art. If the protein is not secreted but obtained within the cell as a denatured insoluble protein (inclusion bodies), solubilisation and naturation must be carried out thereafter. These methods are also known to one skilled in the art.

SEQ ID NO:7 describes the amino acid sequence of IL-16 in its precursor form, which is also a subject matter of the invention.

The term “IL-16” within the sense of the invention is understood as a polypeptide with the activity of IL-16. IL-16 preferably exhibits the effect stated in the International Patent Application No. PCT/EP96/01486 or it stimulates cell division according to WO 94/28134.

IL-16 binds to CD4⁺ lymphocytes and can suppress the replication of viruses such as for example HIV-1, HIV-2 and SIV. The function of IL-16 is not limited by its presentation in the MHC complex.

IL-16 in particular exhibits one or several of the following properties:

binding to T cells via the CD4 receptor,

stimulating the expression of the IL-2 receptor and/or HLA-DR antigen on CD4⁺ lymphocytes,

stimulating the proliferation of T helper cells in the presence of IL-2,

suppressing the proliferation of T helper cells stimulated with anti-CD3 antibodies,

suppressing the replication of viruses, preferably HIV-1, HIV-2 or SIV.

The term “hybridizing under stringent conditions” means that two nucleic acid fragments hybridize with one another under standardized hybridization conditions as for example described in Sambrook et al., “Expression of cloned genes in E. coli” in Molecular Cloning: A laboratory manual (1989), Cold Spring Harbor Laboratory Press, New York, U.S.A. Such conditions are for example hybridization in 6.0×SSC at about 45° C. followed by a washing step at 2×SSC at 50° C. To select the stringency, the salt concentration in the washing step can be selected for example between 2.0×SSC at 50° C. for low stringency and 0.2×SSC at 50° C. for high stringency. In addition the temperature in the washing step can be varied between room temperature (ca. 22° C.) for low stringency and 65° C. for high stringency.

A “regulatory element” is understood as a DNA sequence which regulates the expression of genes (e.g. promoter, attenuator, enhancer). A promoter is understood as a cis-acting DNA sequence which is usually 80-120 base pairs long and is located 5′ upstream of the initiation site of the gene to be expressed. A promoter is in addition characterized in that RNA polymerase can bind to it and can initiate the correct transcription. A preferred DNA fragment with promoter activity spans nucleotides 2053-3195 of SEQ ID NO:1.

An enhancer is usually understood as a cis-acting DNA sequence of ca. 50-100 bp in length which is of paramount importance for an efficient transcription. Enhancer sequences work independently of orientation and position.

An intron is understood as a nucleotide sequence which is present in eukaryotic genes and is transcribed into pre-mRNA and is removed from the mRNA in a further step (splicing). The IL-16 gene contains several introns and exons which are described in SEQ ID NO:1, pCI/IL16 PROM and/or SEQ ID NO:5.
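
The exon/intron layout can be illustrated with a short sketch. It takes the seven exon coordinates given in the feature table of the sequence protocol for the 15936 bp genomic sequence and derives exon lengths and the intervening intron spans. The function and variable names are illustrative assumptions, and the coordinates are assumed to be 1-based and inclusive.

```python
# Exon coordinates (1-based, inclusive) as listed in the feature table of the
# sequence protocol for the 15936 bp genomic sequence.
# All names in this sketch are illustrative assumptions.
EXONS = [
    (3100, 3238),
    (5540, 6635),
    (7504, 7672),
    (9711, 9812),
    (12065, 12323),
    (12578, 12703),
    (14767, 15936),
]


def exon_lengths(exons):
    """Return the length of each exon for 1-based inclusive coordinates."""
    return [end - start + 1 for start, end in exons]


def intron_spans(exons):
    """Return the (start, end) span of every intron lying between two exons."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


lengths = exon_lengths(EXONS)
introns = intron_spans(EXONS)

print(lengths)       # length of each of the seven listed exons
print(sum(lengths))  # total exonic length in base pairs
print(len(introns))  # number of introns between the listed exons
```

Keeping the coordinates as plain tuples makes it easy to check that an edit to the genomic sequence, such as filling the positions denoted “N”, leaves the intron/exon junctions where they are expected.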

Plasmid pCI/IL-16 PROM contains a sequence upstream of SEQ ID NO:5. SEQ ID NO:5 describes the 3′ terminal part of the genomic DNA whereas the plasmid describes the 5′ terminal part. Both sequences overlap in the region of nucleotide 1 to nucleotide 87 of SEQ ID NO:5. Thus the plasmid contains the IL-16 sequence 5′ upstream of nucleotide 87 of SEQ ID NO:5. These are coding sequences and regulatory elements as well as one or several introns. In the first intron of SEQ ID NO:5 about 600 base pairs are missing at the position denoted “N”. These nucleotides can either be deleted or filled up by any nucleotides. However, it is important that the intron/exon junctions remain correct. The order of these base pairs is shown in SEQ ID NO:1.

A further subject matter of the invention are regulatory elements of the expression of IL-16 (in particular promoter and enhancer elements as they are present on the plasmid pCI/IL-16 PROM or in SEQ ID NO:1/SEQ ID NO:5 or can be derived therefrom). Promoter elements are at the 5′ end upstream of exon 1. The enhancer elements are on the 5′ side of the IL-16 gene to be expressed in the said plasmid as well as at the 3′ end of SEQ ID NO:1/SEQ ID NO:5.

The regulatory elements according to the invention are particularly suitable for the expression of polypeptides with IL-16 activity in eukaryotic cells. The regulatory elements are, however, also suitable for expression of other genes in eukaryotic cells. The regulatory elements are particularly advantageous for expression in lymphocytes, in particular in T lymphocytes and cells or cell lines derived therefrom. Suitable regulator sequences can be selected as described in Example 7.

IL-16 is preferably recombinantly produced in eukaryotic host cells. Such production methods are known to a person skilled in the art and are for example described in EP-B 0 148 605. However, in order to obtain the forms of IL-16 according to the invention by recombinant production in a defined and reproducible manner, additional measures have to be taken beyond the processes for recombinant production familiar to a person skilled in the art. For this purpose a DNA is first prepared which is able to produce a protein that has the activity of IL-16. The DNA is cloned into a vector that can be transferred into a host cell and can be replicated there. Such a vector contains regulatory elements that are necessary to express the DNA sequence in addition to the IL-16 sequence. One or several regulatory elements contained in SEQ ID NO:1 are preferably used. Such a nucleic acid which contains the regulatory elements is transferred into a vector which is capable of expressing the DNA of IL-16. The host cell is cultured under conditions that are suitable for vector amplification and IL-16 is isolated. In this way suitable measures ensure that the protein can adopt an active tertiary structure in which it exhibits IL-16 properties.

A lymphoid expression cell line is preferably used instead of the usual host cells (COS, CHO). In this connection IL-16 may be processed into the active shortened form.

In this process it is not necessary that the expressed protein contains the exact IL-16 amino acid sequence from SEQ ID NO:7. Proteins are equally suitable which contain essentially the same sequence and have analogous properties. A eukaryotic expression using the regulatory elements and/or the genomic DNA of IL-16 ensures that IL-16 is correctly processed. In this way a protein is obtained in a recombinant manner which is essentially or completely identical to natural IL-16.

The nucleic acid sequence of the protein can be modified. Such modifications are for example:

modification of the nucleic acid in order to introduce various recognition sequences of restriction enzymes to facilitate the steps of ligation, cloning and mutagenesis

modification of the nucleic acid to incorporate preferred codons for the host cell

extension of the nucleic acid by additional operator elements in order to optimize the expression in the host cell.

In addition the expression vectors usually contain a selectable marker in order to select the transformed cells. Such selectable markers are for example the DHFR gene, the resistance genes for ampicillin, chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline (Davies et al., Ann. Rev. Microbiol. 32 (1978) 469). Selectable markers which are also suitable are the genes for substances that are essential for the biosynthesis of substances necessary for the cell such as e.g. histidine, tryptophan and leucine.

Further genetic engineering methods for the construction and expression of suitable vectors are described in J. Sambrook et al., Molecular Cloning: A laboratory manual (1989), Cold Spring Harbor Laboratory Press, New York, N.Y.

Recombinant IL-16 can be expressed in eukaryotic cells such as for example CHO cells, yeast or insect cells. CHO cells, COS cells or host cells derived from lymphocytes (preferably from T lymphocytes) are preferred as the eukaryotic expression system. Expression in yeast can be achieved by means of three types of yeast vectors: integrating YIp (yeast integrating plasmids) vectors, replicating YRp (yeast replicon plasmids) vectors and episomal YEp (yeast episomal plasmids) vectors. More details of this are for example described in S. M. Kingsman et al., Tibtech 5 (1987) 53-57.

A further subject matter of the invention is a eukaryotic host cell which is transformed or transfected with a nucleic acid that codes for an IL-16 polypeptide according to the invention in such a way that the host cell expresses the said polypeptide. Such a host cell usually contains a biologically functional nucleic acid vector, preferably a DNA vector, e.g. a plasmid DNA, that contains this nucleic acid.

A further subject matter of the invention is human interleukin-16 or interleukin-16 from primates, preferably human IL-16, which can be obtained essentially free of other human proteins as a correctly processed product of a eukaryotic expression. IL-16 is a protein that occurs as a monomer or as a multimer composed of monomeric IL-16 (denoted subunits in the following). The molecular weight of a monomeric IL-16 subunit is preferably ca. 14 kD. In addition a monomeric IL-16 polypeptide is preferred which cannot be cleaved into further subunits.

It surprisingly turned out that the nucleic acid and protein sequence of IL-16 described in WO 94/28134 do not correspond to the natural human sequences. This is merely an IL-16 fragment. However, for therapeutic use it is preferable to use a correctly processed protein which is either identical to the natural protein or only differs slightly from the natural protein and exhibits at least a comparable activity and hence low immunogenicity.

Within the sense of the invention the nucleic acid sequence of IL-16 can contain deletions, mutations and additions. An IL-16 (monomeric form, subunit) that is coded by such a nucleic acid can be multimerized in a preferred embodiment. In this way the activity of IL-16 can be increased. Such multimeric forms are preferably dimeric, tetrameric or octameric forms.

In a further preferred embodiment polypeptides of the invention can additionally contain a defined content of metal ions wherein the number of metal ions per subunit is preferably 0.5 to 2.

Within the sense of the invention many metal ions are suitable as the metal ions. Alkaline earth metals as well as elements of side groups have proven to be suitable. Particularly suitable are alkaline earth metals, cobalt, zinc, selenium, manganese, nickel, copper, iron, magnesium, potassium, molybdenum and silver. The ions can be monovalent, divalent, trivalent or quadrivalent. Particularly preferred are divalent ions. The ions are preferably added as solutions of MgCl₂, CaCl₂, MnCl₂, BaCl₂, LiCl₂, Sr(NO₃)₂, Na₂MoO₄, AgCl₂.

Such multimeric forms and forms of IL-16 containing metal ions are described in the International Patent Application No. PCT/EP96/05661.

The polypeptide according to the invention can be produced in such a way that a eukaryotic host cell which is transformed or transfected with a nucleic acid according to the invention is cultured under suitable nutrient conditions and the desired polypeptide is optionally isolated. If the polypeptide is to be produced in vivo as part of a gene therapy treatment, the polypeptide is of course not isolated from the cell.

In addition the invention concerns a pharmaceutical composition which contains a polypeptide according to the invention in an adequate amount and/or specific activity for a therapeutic application as well as optionally a pharmaceutically suitable diluent, adjuvant and/or carrier.

The polypeptides according to the invention are particularly suitable for the treatment of pathological states that have been caused by viral replication, especially retroviral replication, and for immunomodulation. Such therapeutic applications are also described in WO 94/28134 as well as in the International Patent Application No. PCT/EP96/01486. Diagnostic test procedures are also described in the latter.

The polypeptides according to the invention can preferably be used for immunosuppression. This immunosuppression is preferably achieved by inhibiting the helper function of TH₀ and/or TH₁ and TH₂ cells. The polypeptides according to the invention are therefore of therapeutic value in all diseases in which an immunodysregulatory component is postulated in the pathogenesis, in particular a hyperimmunity. Diseases which can be treated with IL-16 can be diseases in cardiology/angiology such as myocarditis, endocarditis and pericarditis, in pulmonology for example bronchitis and asthma, in hematology autoimmune neutropenia and graft rejection, in gastroenterology chronic gastritis, in endocrinology diabetes mellitus type I, in nephrology glomerulonephritis, diseases in the field of rheumatoid diseases, diseases in ophthalmology, in neurology such as multiple sclerosis, in dermatology such as eczema. The polypeptides according to the invention can in particular be used in autoimmune diseases, allergies and to avoid graft rejections.

A further subject matter of the invention is the use of nucleic acids according to the invention in the field of gene therapy. Vector systems that are suitable for this are for example retroviral or non-viral vector systems.

The following examples and publications as well as the sequence protocol are intended to elucidate the invention, the scope of which is characterized by the patent claims. The methods described are to be understood as examples which also after modifications still describe the subject matter of the invention.

The plasmid pCI/IL 16 PROM was deposited on the 26.03.96 under the No. DSM 10603 at the “Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM)”, Mascheroder Weg 1b, D-38124 Braunschweig.

General

Cells

Human peripheral blood mononuclear cells (PBMC) were isolated by Ficoll Hypaque gradient centrifugation and cultivated in RPMI 1640 medium supplemented with 20% fetal calf serum, 100 units/ml IL-2 and 5 μg/ml phytohemagglutinin (PHA). T lymphocyte subsets were prepared as described by Ennen, J., et al., Proc. Natl. Acad. Sci. U.S.A. 91 (1994) 7207-7211.

RNA preparations and Northern blotting

Total RNA was extracted using the RNA-Isolation Kit (Stratagene, La Jolla, U.S.). Poly (A⁺) RNA was isolated from total RNA with the Oligotex-dT mRNA system (Qiagen, Hilden, DE). 10 μg of total RNA or 2 μg of Poly (A⁺) RNA were loaded on a formaldehyde agarose gel and after electrophoresis blotted onto a positively charged nylon membrane (Boehringer Mannheim GmbH, DE). The IL-16 cDNA probe was generated using the PCR DIG-probe synthesis system (Boehringer Mannheim GmbH, DE) and spans the IL-16 cDNA region from nucleotide 1693 to the end of the reading frame at nucleotide 2082. Hybridizations were carried out at 58° C. overnight followed by several high stringency washes. For detection of the signals the DIG luminescent system (Boehringer Mannheim GmbH, DE) was employed according to the manufacturer's recommendations. The quality of RNA preparations was routinely assessed by hybridization with a β-actin probe. The Human RNA master blot (Clontech Laboratories, Palo Alto, U.S.) was analysed with the same IL-16 cDNA hybridization probe under comparable conditions.

Reverse transcription and PCR

Identification of the 5′ end of IL-16 precursor mRNA was performed using the 5′ RACE system for rapid amplification of cDNA ends (Life Technologies, Gaithersburg, U.S.). Additional RACE experiments were carried out with parts of the CapFinder system and Marathon-Ready cDNAs from human lymph nodes, leukocytes and murine leukocytes respectively (all from Clontech Laboratories, Palo Alto, U.S.). All other cDNAs were synthesized using up to 5 μg of total PBMC RNA and oligo-dT as primer (Pharmacia, Uppsala, SE).

Gel purified PCR products were ligated into the pGEM-T vector (Promega, Madison, U.S.) before determination of the nucleotide sequences according to standard methods.

Peptides and antibodies

Antibodies specific for pro-IL-16 coupled to KLH via disulfide bonds were raised in rabbits. The antisera and the peptides were obtained from the Custom peptide antibody production program (Eurogentec, Seraing, BE). Recombinant IL-16 (rIL-16His) was used to raise antisera in goats (Baier, M., et al., Nature 378 (1995) 563). Affinity purified goat anti-IL-16 antibodies were used at an IgG concentration of 0.25 μg/ml in immunoblot experiments.

Immunoblots

Cell lysates were prepared by incubation of 2.5×10⁷ cells in 400 μl of solubilization buffer (20 mM Tris HCl, pH 7.5, 1% NP-40, 150 mM NaCl, 5 mM EDTA, 1 mM phenylmethyl-sulfonyl-fluoride, 10 mM sodium fluoride, 1 mM sodium pyrophosphate, 5 μg/ml aprotinin and 5 μg/ml leupeptin) for 15 minutes on ice. Nuclei were removed by centrifugation and the volume was finally adjusted to 500 μl with 4×SDS sample buffer. Immunoblots were carried out according to standard protocols. Antisera were used at appropriate dilutions in blocking buffer (phosphate buffered saline (PBS), pH 7.2, 5% Marvel). Finally, bound antibody was detected using the enhanced chemiluminescence (ECL) kit (Amersham, Little Chalfont, UK).

Proteolytic cleavage of pro-IL-16 in cell lysates

Purified CD8(+) cells were lysed after cultivation for 2 days by incubation in PBS-Dulbecco/1% NP-40 for 10 minutes on ice. Lysates were clarified by centrifugation and finally diluted 1:5 in PBS. The equivalent of 4.5×10⁶ cells was incubated with 30 μg rIL-16His in a volume of 66 μl for 16-48 hours at room temperature. Thereafter, the cleavage of rIL-16His was analysed by immunoblotting with 1:100 diluted serum 802, which recognizes the carboxyterminus of IL-16.

Example 1

A 2.6 kb mRNA is the main IL-16 transcript and is predominantly expressed in lymphatic tissues.

IL-16 mRNA expression in PBMC was examined by Northern blotting using a fragment corresponding to the carboxyterminal 390 bp as hybridization probe under stringent conditions. The major transcript in PBMCs is of 2.6 kb length.

The human RNA tissue blot allows the direct quantitative comparison of gene expression in 50 different tissues. IL-16 mRNA was detectable at equally strong levels in spleen, thymus and lymph node samples. Significantly lower levels of expression were seen in peripheral leukocytes, bone marrow, fetal spleen, fetal thymus, stomach and the cerebellum. Only traces of IL-16 specific mRNA were found in appendix, occipital lobe, salivary gland and mammary gland tissue. Thus, 13 out of 50 tissues scored positively in this hybridization analysis. Of the 13 positive tissue samples, 8 were of lymphatic origin, including those with the highest expression levels.

Example 2

Identification of the IL-16 mRNA 5′ end by different RACE approaches

To confirm that the IL-16 precursor mRNA is indeed larger than previously described, 5′ RACE experiments were conducted using an oligonucleotide located in a known region of pro-IL-16 as primer for the cDNA synthesis. The cDNAs were tailed by use of the terminal deoxynucleotidyl transferase and finally amplified with PCR primers specific for IL-16 and the homopolymeric dC-tail. To enhance the specificity of these PCR amplifications, in some cases secondary nested or semi-nested PCRs were performed prior to cloning the products. Surprisingly, the IL-16 cDNA sequence extends almost 1 kb beyond its previously published 5′ end. The CapFinder system displays higher selectivity for reverse transcripts of complete mRNAs (Zhu, Y., et al., Clontechniques 11 (1996) 30-31) and was therefore used in combination with IL-16 specific primers for cDNA synthesis to extend and confirm the sequence information obtained with the tailing approach. With both methods the 5′ end of the pro-IL-16 cDNA was found to consist of a fairly heterogeneous set of transcriptional starting points. Similar results were obtained with commercially available cDNAs from human lymph nodes, leukocytes and murine leukocytes respectively. Incompletely processed transcripts were also found containing parts of an intron just upstream of the putative initiation codon.

Example 3

Sequence of the IL-16 precursor

The complete nucleotide and deduced amino acid sequence of the pro-IL-16 cDNA is shown in SEQ ID NO:6. The first ATG is found at position 190 and if used as a start codon would result in a relatively hydrophilic protein of 631 amino acids with a calculated molecular weight of 67 kDa. Use of the second ATG at position 271, which is in frame and in very good context for the initiation of translation, would give rise to a 63 kDa protein. The putative 5′ leader sequence is G-C rich (61%) and may contribute to the formation of secondary structures.
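
The reading-frame arithmetic can be checked with a minimal sketch. It assumes 1-based inclusive cDNA coordinates and takes the end of the reading frame at nucleotide 2082, as given for the hybridization probe in the methods; the function name is hypothetical. The sketch counts codons only, since the text does not state whether the last codon position is a stop codon.

```python
# Positions are 1-based cDNA coordinates taken from the text.
# The function name is an illustrative assumption.
FIRST_ATG = 190
SECOND_ATG = 271
ORF_END = 2082  # end of the reading frame as given in the methods section


def codons_between(start, end=ORF_END):
    """Count whole codons from `start` to `end`, both inclusive."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("start is not in frame with the end of the reading frame")
    return span // 3


def same_frame(pos_a, pos_b):
    """Two positions share a reading frame if they are a multiple of 3 apart."""
    return (pos_b - pos_a) % 3 == 0


print(codons_between(FIRST_ATG))          # 631
print(codons_between(SECOND_ATG))         # 604
print(same_frame(FIRST_ATG, SECOND_ATG))  # True
```

The 631 codons counted from position 190 agree with the 631 amino acids stated above, and the second ATG lies 27 codons further downstream in the same frame.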

The most significant homologies in database searches were seen to the postsynaptic density protein 95 and the tight junction protein ZO-1 (Genbank accession numbers P31016 and Q07157 respectively). Both carry GLGF motifs and the resemblance is solely due to the three GLGF domains in the carboxyterminal half of the pro-IL-16 sequence.

Database searches for the 3′ untranslated region (UTR) of the pro-IL-16 cDNA, which has a proposed length of 979 bp, were also carried out. The three expressed sequence tags (Genbank accession numbers N38840, H57532 and N22522) that cover the 3′ end were found to begin in a region between nucleotides 2371 and 2385 due to utilization of the polyadenylation signal at nucleotide position 2353. Thus, for the majority of transcripts the 3′ UTR of approximately 303 nucleotides would be significantly shorter than previously described.
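
The 3′ UTR estimate follows from simple coordinate arithmetic, sketched here under the assumption that the UTR begins directly after the end of the reading frame at nucleotide 2082 (the position given in the methods). The function and constant names are hypothetical.

```python
# 1-based cDNA coordinates taken from the text; names are illustrative assumptions.
ORF_END = 2082                  # end of the reading frame
POLYA_SIGNAL = 2353             # polyadenylation signal position
EST_END_REGION = (2371, 2385)   # region in which the three ESTs begin
PROPOSED_UTR_LENGTH = 979       # proposed full 3' UTR length in bp


def utr_length(transcript_end, orf_end=ORF_END):
    """Length of the 3' UTR when the transcript ends at `transcript_end`."""
    return transcript_end - orf_end


short_utr = [utr_length(pos) for pos in EST_END_REGION]
signal_distance = [pos - POLYA_SIGNAL for pos in EST_END_REGION]

print(short_utr)                             # [289, 303]
print(signal_distance)                       # [18, 32]
print(PROPOSED_UTR_LENGTH - max(short_utr))  # 676
```

The upper bound of 303 nucleotides matches the figure stated above, and the transcript ends fall 18 to 32 nucleotides downstream of the polyadenylation signal.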

Example 4

Immunoblot detection of pro-IL-16 in cell lysates

The IL-16 precursor protein was detectable in mitogen stimulated PBMCs as a protein band with an apparent molecular weight of about 80 kDa. In freshly isolated as well as in serotonin stimulated cells an almost equally strong second 60 kDa band was seen. Only overexposure of films allowed detection of the same 60 kDa protein in the samples after 2 days of cultivation in the presence of mitogen and IL-2.

To verify that the pro-IL-16 cDNA can be expressed, the pro-IL-16 coding region under transcriptional control of the CMV promoter (pcDNA3, Invitrogen, San Diego, U.S.) was transfected into COS-7 cells. Immunoblots with lysates from transfected cells and PBMCs revealed that transfected cells, but not untransfected controls, express an 80 kDa protein which migrates in a manner identical to the pro-IL-16 found in PBMCs. Similar results were seen with pro-IL-16 specific serum 804 whereas use of pre-immunization sera gave no signals.

Example 5

Proteolytic cleavage of pro-IL-16 in cell lysates

The IL-16 precursor protein should be a substrate for proteases present in or on CD8(+) cells, the action of which would release the biologically active carboxyterminal portion. A recombinant pro-IL-16 fragment of 39 kDa is specifically proteolytically processed upon incubation in CD8(+) cell lysates and not in lysates from CD4(+) cells. It was investigated whether the recombinant 130 aa fragment would still act as a substrate for this proteolytic activity. Indeed, incubation of rIL-16His, which migrates as a 19 kDa protein in SDS gels, with CD8(+) cell lysate yields the same 17 kDa carboxyterminal fragment as seen previously in the 39 kDa precursor variant cleavage assays. No protease activity is detectable during the incubation time without addition of cell lysate. Therefore it is likely that naturally processed IL-16 is smaller than the originally suggested 130 amino acids. Preliminary laser desorption mass spectroscopy data obtained with purified cleavage products indicate that proteolytic processing occurs at the aspartate residue 510.

Example 6

Construction of the plasmid pCI/IL 16 PROM

A DNA fragment which contains the putative IL-16 promoter and further regulatory elements was amplified using the Promoter Finder™ DNA Walking Kit (Clontech Laboratories, Palo Alto, Cat.# K1803-1). The kit mentioned above contains 5 DNA libraries with adaptor-ligated human subgenomic fragments. A fragment of ca. 2.7 kb was amplified in a nested PCR with 1 μl DNA from the Ssp I library as a template. In the first PCR round the adaptor primer AP1 from the kit and the gene-specific primer GSP1 were used. In the second round the adaptor primer AP2 from the kit and the gene-specific primer GSP2 were used. The gene-specific primers are reverse and complementary to sequences from Exon 2 according to SEQ ID NO:1.

Sequence of the primer GSP1:

(SEQ ID NO:2) CTTTTCGTCAAGTAGCTTCGTCTCACAG

Sequence of the primer GSP2:

(SEQ ID NO:3) GAAATCGAAGCGGCCGCGCTCCGTGCTCGCTGGCTAGGCATCTTGAG

The amplification products were digested with the restriction endonucleases Mlu I and Not I and cloned into the expression vector pCI (Promega Corporation, Madison, Cat.# E1731). The constructed plasmid pCI/IL 16 PROM (DSM 10603) is ca. 6.7 kb in size. Due to the design of the Promoter Finder kit the following nucleotides of the Promoter Finder adaptor are present in front of the 5′ end of the cloned subgenomic sequence: (SEQ ID NO:4) GGTCGACGGCCCGGGCTGGT.
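
The Not I digestion requires a Not I recognition site at the gene-specific end of the amplified fragment. The following minimal sketch searches the two gene-specific primers for the Not I recognition sequence GCGGCCGC by plain string matching. The helper names are hypothetical, and the sketch says nothing about cutting efficiency close to the end of a PCR product.

```python
# Primer sequences as given in SEQ ID NO:2 and SEQ ID NO:3.
# Helper names are illustrative assumptions.
GSP1 = "CTTTTCGTCAAGTAGCTTCGTCTCACAG"
GSP2 = "GAAATCGAAGCGGCCGCGCTCCGTGCTCGCTGGCTAGGCATCTTGAG"
NOT_I_SITE = "GCGGCCGC"  # Not I recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, site):
    """Return the 1-based start positions of every occurrence of `site`."""
    width = len(site)
    return [i + 1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


print(find_sites(GSP1, NOT_I_SITE))                  # []
print(find_sites(GSP2, NOT_I_SITE))                  # [10]
print(reverse_complement(NOT_I_SITE) == NOT_I_SITE)  # True, the site is palindromic
```

Only GSP2, the primer of the second (nested) PCR, carries the site, so it is present at the gene-specific end of the final product. Because the site is palindromic, searching one strand is sufficient.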

Example 7

Selection of IL-16 regulator sequences

In order to determine whether a promoter sequence is present in SEQ ID NO:1, DNA fragments upstream of exon 1 were cloned in front of a reporter gene (e.g., luciferase, de Wet, J. R., et al., Mol. Cell. Biol. 7 (1987) 725-737). When such a fragment exhibits promoter activity, there is an expression of the reporter protein which can be detected by standard methods (Luciferase Assay system from Promega, Cat.# E1500). Promoter activity was identified in a fragment spanning nucleotides 2053-3195 according to SEQ ID NO:1.
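
The position of the active fragment relative to exon 1 can be worked out from the coordinates alone. The sketch below assumes 1-based inclusive coordinates and uses the exon 1 span 3100..3238 from the feature table of the sequence protocol. The helper names are hypothetical, and a run of “N” stands in for the real 15936 bp sequence, which is not reproduced here.

```python
# Coordinates are 1-based and inclusive; names are illustrative assumptions.
PROMOTER_FRAGMENT = (2053, 3195)  # fragment with promoter activity
EXON_1 = (3100, 3238)             # first exon in the sequence protocol
GENOMIC_LENGTH = 15936


def slice_1based(sequence, start, end):
    """Extract positions start..end (1-based, inclusive) from a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates lie outside the sequence")
    return sequence[start - 1:end]


def overlap_length(span_a, span_b):
    """Number of positions shared by two inclusive spans."""
    start = max(span_a[0], span_b[0])
    end = min(span_a[1], span_b[1])
    return max(0, end - start + 1)


# Placeholder only: the real sequence would be loaded from the sequence protocol.
placeholder = "N" * GENOMIC_LENGTH
fragment = slice_1based(placeholder, *PROMOTER_FRAGMENT)

print(len(fragment))                              # 1143
print(EXON_1[0] - PROMOTER_FRAGMENT[0])           # 1047
print(overlap_length(PROMOTER_FRAGMENT, EXON_1))  # 96
```

Under these assumptions the 1143 bp fragment covers 1047 bp upstream of exon 1 and extends 96 bp into it.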

Example 8

Expression and purification of recombinant human IL-16

CHO cells are transformed, selected, fermented and lysed using an expression plasmid which contains the genomic IL-16 gene according to SEQ ID NO:1 or the (mRNA analogous) cDNA sequence SEQ ID NO:6.

After centrifugation the supernatant is applied to a Q-Sepharose FF column (45 ml; Pharmacia) which has previously been equilibrated with 20 mM imidazole, pH 6.5. IL-16 is eluted with a gradient of 0 to 0.3 M NaCl in 20 mM imidazole, pH 6.5. Fractions containing IL-16 were identified by means of SDS-PAGE and pooled. The identity of IL-16 was confirmed by mass analysis and automated N-terminal sequence analysis. In order to determine the concentration, the UV absorption of IL-16 at 280 nm and a calculated molar absorption coefficient of 5540 M⁻¹ cm⁻¹ at this wavelength (Mack et al. (1992) Analyt. Biochem. 200, 74-80) are used.
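
The concentration determination is an application of the Beer-Lambert law, c = A / (ε · l). The sketch below shows the calculation with the stated coefficient of 5540 M⁻¹ cm⁻¹. The absorbance reading of 0.277 and the 1 cm path length are hypothetical example values, not measurements from this work, and the function name is an assumption.

```python
# Beer-Lambert law: A = epsilon * c * l, so c = A / (epsilon * l).
# Only the absorption coefficient comes from the text; the absorbance
# and path length used below are hypothetical example values.
EPSILON_280 = 5540.0  # molar absorption coefficient at 280 nm, M^-1 cm^-1


def molar_concentration(a280, path_cm=1.0, epsilon=EPSILON_280):
    """Return the molar concentration for a blank-corrected A280 reading."""
    if a280 < 0:
        raise ValueError("absorbance must not be negative")
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and epsilon must be positive")
    return a280 / (epsilon * path_cm)


example_a280 = 0.277  # hypothetical reading in a 1 cm cuvette
concentration = molar_concentration(example_a280)
print(f"{concentration * 1e6:.1f} uM")  # 50.0 uM
```

Converting the molar value into a mass concentration would additionally require the molecular mass of the IL-16 form actually purified.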

The IL-16 obtained in this way had a purity of more than 95% in an SDS-PAGE under reducing conditions. The analytical Superdex 75 FPLC column (Pharmacia) was eluted with mM Na-phosphate, 0.5 M NaCl, 10% glycerol, pH 7.0 at a flow rate of 1 ml/min. The amount of protein applied in a volume of 100 to 150 μl was 1.5 to 15 μg. Detection was at 220 nm.

A Vydac Protein & Peptide C18, 4×180 mm column is used for the purity analysis by means of RP-HPLC. It was eluted with a linear gradient of 0% to 80% B (solvent B: 90% acetonitrile in 0.1% TFA; solvent A: 0.1% TFA in H₂O) within 30 min at a flow rate of 1 ml/min. Detection was at 220 nm.
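The solvent composition during the RP-HPLC run follows directly from the gradient parameters. The sketch below, with hypothetical function names, computes the percentage of solvent B and the resulting acetonitrile content at a given time, assuming an ideal linear gradient without delay volume and a composition that is held constant outside the 0 to 30 min window.

```python
GRADIENT_START_B = 0.0      # % B at t = 0 min (from the text)
GRADIENT_END_B = 80.0       # % B at the end of the gradient (from the text)
GRADIENT_DURATION = 30.0    # min (from the text)
ACETONITRILE_IN_B = 0.90    # solvent B is 90% acetonitrile (from the text)
FLOW_ML_PER_MIN = 1.0       # flow rate (from the text)


def percent_b(t_min: float) -> float:
    """Percentage of solvent B at time t for an ideal linear gradient."""
    # Assumption: the composition is held constant outside the gradient window.
    t = min(max(t_min, 0.0), GRADIENT_DURATION)
    slope = (GRADIENT_END_B - GRADIENT_START_B) / GRADIENT_DURATION
    return GRADIENT_START_B + slope * t


def percent_acetonitrile(t_min: float) -> float:
    """Acetonitrile content (%) of the mobile phase at time t."""
    return ACETONITRILE_IN_B * percent_b(t_min)


def volume_delivered_ml(t_min: float) -> float:
    """Total mobile-phase volume (ml) pumped up to time t."""
    return FLOW_ML_PER_MIN * max(t_min, 0.0)
```

Halfway through the run (15 min) the mobile phase would thus contain 40% B, i.e. 36% acetonitrile, and the gradient ends at 72% acetonitrile after 30 ml have been delivered.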

LIST OF REFERENCES

Baier, M., et al., Nature 378 (1995) 563

Cruikshank, W. W., et al., J. Immunol. 146 (1991) 2928-2934

Cruikshank, W. W., et al., Proc. Natl. Acad. Sci. U.S.A. 91 (1994) 5109-5113

Davies et al., Ann. Rev. Microbiol. 32 (1978) 469

de Wet, J. R., et al., Mol. Cell. Biol. 7 (1987) 725-737

Ennen, J., et al., Proc. Natl. Acad. Sci. U.S.A. 91 (1994) 7207-7211

European Patent No. 0 148 605

Patent Application No. PCT/EP96/01486

International Patent Application No. PCT/EP96/05661

International Patent Application No. PCT/EP96/05662

Kingsman, S. M., et al., Tibtech 5 (1987) 53-57

Mack et al., Analyt. Biochem. 200 (1992) 74-80

Sambrook et al., “Expression of cloned genes in E. coli” in Molecular Cloning: A laboratory manual (1989), Cold Spring Harbor Laboratory Press, New York, U.S.A.

WO 92/20702

WO 92/20703

WO 94/28134

Zhu, Y., et al., Clontechniques 11 (1996) 30-31

7 15936 base pairs nucleic acid double linear DNA exon 3100..3238 exon5540..6635 exon 7504..7672 exon 9711..9812 exon 12065..12323 exon12578..12703 exon 14767..15936 1 CTGGTACTTT CATGTACATT GTGTCAATTAAATCTTACAA CCACCTGATG GTGAGAGTCC 60 CAGTTGTGCA AAGAGGGAAC TGAGAAACTCGCTCAAGTTC ACATTGCTAG TGGGTCACAG 120 TTACCCACTA TGCAATGCTG AGTTTCCCATCCTTACCCAG AAGCTGTCTC CCCCATCACG 180 GAATGGCCCT GCAGGGGCCT TGGCCCTTCCCTTAAGCACA CCGTGGGCAG GTGGGAGGGG 240 GCCTCTGGAA ATCCCCTAAA ACAATCTACAGTGAGGGTTG GCAAGCTTCA GGAGGGGTAG 300 CGGTGGAGAC GGGATGTTGA GTAGGAGGGGTGAAAGTGAA GGAGGGAGGA GGAAGTCAGG 360 TTACTAAAAA GGAAGTGCAG TTTTGCAGAGCGCTGGCAGG ATGTGGCTGG TTAGCAACAC 420 ATGATGGAGT TATCAGATGG ATCTGTTCTCCCACCCCCTC TTTTAAGCCC ATATTTCATT 480 TTCCCTTGGG CCTGAGACCT ATGAGTCCAGAAGGGCAAAT CAGGAGCAGC CAGCTGAGAA 540 GCCAGTCATT TTGCCTTCCT CCTGGAGGCCCAGAAAAGGA GTGCAGTTGC CACAGCAGGT 600 CTGTCCCAGT GATGCCCTTG GCAGAGCCACATAGGGTCAG GGAATATTCC CGAGTAAGGC 660 GTCTGGAGAA GGAACTGGGG TGTCCTCAGGGAAGCCCAGG CAAGGAACTT CCCACAGGTC 720 ATCTTTCATG CCCTGTGTGC CTCAGCAGAATACAGGGCTC CCTCCTGTAC CTGCCCCCAA 780 CAGCACCGTC TCCTGGAGAC AAGGCCGTTCTGCAGCTGCC CTCCTCTGTT TGCTCTGGTC 840 TGTCCCACGT GGCCAGCAGA TCCTTCTCCCACAAACATTT CCATAAAAGC AATCAGCACG 900 ACATAATTTT ATTGGGCACT GGAGAGGCGGGGGTCACAGA GAAAGGATCA TAGCACGCAA 960 TAGAAAACAG GAAGCAATTT GCTTGAGGTCACTCTCAAAA TTTCCTTGAC CATTCCTGGG 1020 GATTTCTCAC TGGATGTTTT CTTCCGTGGCCACAGTGTCC ACTGTCTCAC TCTCCCACTT 1080 TTCCCTCCCT GCAGTCTTTT CTCTACAGGGTCCCAAGACT GTTGTCAAGC TGCTATGGAT 1140 GACCCCCAAG TCTTTCCCAT CTACTCTAATTCAAGGGGAC TTAACTCTCA CAACCAATAT 1200 GAACTTTGGG ATATTTTCTT CAACAAAAAACTAACATCCC TCTTATAAAA AATCAGCCTA 1260 AACTCTCCCC CGTGCTTTAA AAAACTTGCTTAAAAAAACA CAACAGGATT TTCGAAGAAT 1320 CCTTTCTTAG AAAACAAACA AAAAACCAAACAAAAACGTA CTTTCTCCCC ACTATGCTGA 1380 ATCTGCTCTC TCCCTCTTTC CTTCTTTCTCTTCTCCCTCC CTCATCCACC AAGCATGGTT 1440 CCAGGAAACA CTAGTAAGAG AAAAATTTGTGTAAAATCAA ATTTATTTCA GGGGCTAATT 1500 CTGAAGTCTC CCATGAAGAG GATGCAATATTGGATGTGTT GATGAAAAGA CAAAACACAT 1560 TTGTGCTGTT CTAAGTTGTA 
AAGCTGCATTGACTGCAGTT GGTATCACAT GTGGCTGCCT 1620 GTAAAAGAGC TAACAATCCA TGAATGTCAACAGATGTCAA CTTACAGAGC TCCCAACCAG 1680 GTGAAATTAA ATTTCATCCC CATGTAATTTCTCTTGTGGA CAAGAGACTT AGAGATCATC 1740 AGGCCAATAT TTGGAGCTTC TAATGCCATGACTCAGCCTT ACTTTTTTTA AGGGCTAGTT 1800 CGAGAAGCTC AATGATTCCT TTGAAATTGGTTGATCTCTC AGTATTTCAT AATGCTCTCC 1860 TAAAGCTCAG TTCTACAGTA GGGAACTCGAGCTGAGGCAA TGCTCTGTGA ATACACTTCT 1920 AACTTTTGTA GACCTTTGCT TCCTCCAAAATGTTTATTGT CAATGTAGAT CTTAAATTTC 1980 AGGTCATGGA TATTTGCCAT TGCTTTTTTAGTCCCAAAGG ATGCATTTGT TCCTTCCTCT 2040 TTCCTTCCTC TCTGGAAGGA AGCCTGCAGCCATGTCAAAT GGCCCATCAT TACACAAATC 2100 TGGAATGAAC CGGACGTGGA GGCCACCGCAGTATAGCAAA GCATTTCTCT TGCAGACAAT 2160 CAAATACAGG TGGCATCATT TTGATGATGTCTGCTTTCTC AATCTTTTTT TCCAGATGGG 2220 TATTATGTTT GGTTGGGCTA TGTCTCTGGAACACTATTTT TCTGCCTCTC TGCCTGCAAA 2280 ATTCTTACTC TGTTTATGAA GCAGACAGAAAGTACTTCTC TAGGAAGCCT TCTCCACCTC 2340 TGGGATGGAA CTGGTCTCTC TGTTTCCTCTGGGCCCTGAC AGTACCTTAA GCACACTGTG 2400 TGATAATCCA AACTCTTCAA GTATTGCATAGCTATTCAGA GCCTGGTTCA TCTTGTCCCT 2460 AGGCTGTTTT GGGAGGACCC CAAGGCTGCATCCAGGCTCA GTGGTGGAAA CAATTCTGCA 2520 ACAGATTGAG GATGTCACCC TAGGTGCCACAGAGGGGAGC AATGGCACAG GTGCTGAGTA 2580 TCTGTCATCT TTTGCAGAAC ACTTACTAAATATTTGAAGA GTGCTTAACA AGTGACAGCA 2640 ATTGACAAGC ATTTTCCTGA GTGCTCTAACCTTAGGGAAC ACTAGCTGCC AGGTTTTTTT 2700 GCTGCCTTGT TTCTACGCTG CCCTATACCCTCCCGGGGTG GGTAGTTCAC CAGATGCAGT 2760 TGTAACTGAA GCAATGCCAG TCCCTCCACACTCAAAGCCT TTTGTTCCTA TCATAAAGAG 2820 TCAGGGTCTT CAAGTATGTC AACAGTATGAGCCCATGACC TGGATACCAG GTTGCTGGAA 2880 GTCAAGGGGT GCTCTGCATG TCTAGGAGGTGCTGGCTCTG CCACGTCAGC ACCAAGGAGA 2940 AAAGAATCCT CTTTACATGG CTGCTGATGAACTTCTCACT GAAAGCAGCT CAATATCCGT 3000 TTTTCCGTCC CAATCAAAGC GTGTCTCGCCTTCTCACAGC TTGAGGCTAC CGTTTTGACA 3060 TGGTCTCGCT TCCTGTTTAC ACCACAGGAAGCGAGAGAGC TGCTGCCACT GCTGCTACCA 3120 CAGGAAGACA CAGCAGGGAG AAGCCCTAGTGCCTCTGCCG GCTGCCCAGG ACCTGGTATC 3180 GGCCCACAGA CCAAGTCCTC CACAGAGGGCGAGCCAGGGT GGAGAAGAGC CAGCCCAGGT 3240 AAGCTTTCAT TGAGATCTTC CAAAAGAAAGGGTCTTTTGA AAAAAGGTGC 
AGGGATAAGA 3300 TAAGAGCACA AATTGGCCTG AGGATCAGAGTGCTCTGCTT TGACATCACC ACTGGACCTG 3360 GCTGATCAAC AGTCAAGGGT TCCAGGTGCTGGGCAGCAGC ACCGTGGAGG TGCTGTCCAG 3420 GGTGGGGAGC CTCCTGGGTG GACGGGTGATGCTGTGGGGG TGGCAGCAGG AATGCAGGCC 3480 ATTCTGGATA ATTGGGAGGA TGGAGCCCTTGGAAGAGGTC CAAAGACACA CCCACCCAGC 3540 CATCCTGGGC AGCTACCTCT AGGGTGCCAACATTCCAGTT GCAACAGTCC TGCTGCATTC 3600 AGGATTTCTG AGTCAGAGAA TGCAACGCCAGAAAAGTATA AATAGCATCT ACCAGTGGTC 3660 ACCTCCTCCA TGAGGTGGAT AGCAAGGGTGTGTCCACATG GTACACCTGA CAATGCTAAG 3720 ACATGACTTG AATATTTATT TGCTTGCTAGAGAGAGCATA GGACTTGAAG TCAGGGAATC 3780 CTGGATCTGT CTGTCATTAG TATGATTCTGAGCAACCTAC TTTGCCTTTC TGAGCCCAGA 3840 TTCACTCATC AGCAATGTGT GGAAGAGCACGTAGACCCCA AGCATATGAT GCATTCAGCA 3900 AGCACACATT GAATGCTGGC TTCGTGAGGCCCTGTGCTAG GTGCTGGGGG GAGAGAGAGA 3960 AAGGGATGAG ACCAGGTAGT CAGGTCCTGCCTCCAAGGAG CAACTAGTAG TGGCAGGAAA 4020 TAAACATATA GACTATAAAA CATATGACTGTTTGAGTAAC ACTACAGAAA AGCTCATGAA 4080 AGACTTTTGG AGGGCTAAGG GACATTTTGGAAAGATCTAG CACATCTCAC CCAGAGCAGA 4140 GGCCCTGTGG GTGGAGGCTC CTCTCCCTTAGTCACCCTAA TTGAGGAACA TTCTAGCAAA 4200 AGCAAACGCC TCAGTGTTTA ACTGACAGGAGGTTGTCACT CCAATCTGCA AAGGGCTTGC 4260 CTGTGGAACT TCTGCCTCCT GATTCTCACAACAGCACCTT ACAGCCAAGC CATTCAAAAA 4320 TGCAGAAACA GGCTGGAGGC CTGGGCTCACTTGCCCAAGG CCAGTCTCTT AGATTGTGCA 4380 GAATTTCTCT TTGATATCAT CAAGGTGTAATGCTCCATGA TTCACTTCTT TTGAAACCTG 4440 GCATTGAGAC AAGGGACAGG AGGGATCAGAGTTCCTTCAA CTGGGTTGCG TTCCAGTAGC 4500 AAGCATCCCC CAAGGCACAC AGGCCAGCCTCCCTCTGCCC CTGGGAAAGA GACCAGGACA 4560 CCCTCTCCCT TTACCCATGC AGACATGATGCTGGTTCAAT GCTGGCTTCT GAGAAAGACT 4620 CCTATGTGCT CCAGGGCATG CCTGAGGTCCTGGCTGGCAC AGAGCAGGTT GCACATTGCA 4680 ATCCCCTGCT CATCACATTC CCAACACTCAGGTTGCATCC CAGGTATCTT CAGTCAGTAC 4740 CTAGGGGTGG GCGTCAGTAT TTTTCAGGGCCCTCCAGATT CCAGTATGTA GCCAAGGTGG 4800 AGAATCCTAC TTCACAGATT CCTTTCTACCTGGAACCTTT TCATCAGCTT TTGAGGGAGG 4860 GAAAACACTC CTTTGCCGAG GGCAAGCTGATCAATGACCT GTGTATAGGC AGAGCAGCAA 4920 ACACACGGCT TCAGGGCCAG GCAGGTACATACATGGGAAA TGCTGGCTGG GTGAGAGGGA 4980 GCGTGAAGAG CTGTGGGAAG 
CCGAAGTGGCCCCATCAGAA GCTGTGCACA GGCACCTTGT 5040 TTTTAATGAC ACGGGTAGGT CAAAGCACAAACAGCTGCCA ACTCATGACC TTTGTCTTAA 5100 AAGTTTAAAC GGCAGGAGAA CTGCTTTGGCTTTTACACAT TTAACATGGT ATCTTGGAGG 5160 CTCCTTAGTG CAGTAGAAAG GACATGAACTTCAAGAGTCA GGAGACACAG GGTCTTGCCT 5220 GAGCACTGCC ATCAGATGGC CCCTTACCCTTCTTGAAATG TAATATGCCA GAGGTCGGCC 5280 CAGATATTCT CTGAGCACCC TTCCAGGCCTAAAACACTAG GATACTGTGA GATTAACTCC 5340 TACTTCTGGT CCTTCACTCC TGCCTGTTGGCAGCTCAGTC AGGTAATAGC ACCTGGAGTT 5400 CACCCACCTG GGTGTCCCCC ACTTCTGCTAATCTCCTCCT CTTGAATCCT TCTTGCTGTT 5460 CAGCTTGGAA ACTAGAATTT AGGAAGAAAAGTCACTGTAT GATGTAATGC ACAGCTTTGG 5520 CCTTGTTTCT GCACAGTAGT GACCCAAACATCCCCGATAA AACACCCACT GCTTAAGAGG 5580 CAGGCTCGGA TGGACTATAG CTTTGATACCACAGCCGAAG ACCCTTGGGT TAGGATTTCT 5640 GACTGCATCA AAAACTTATT TAGCCCCATCATGAGTGAGA ACCATGGCCA CATGCCTCTA 5700 CAGCCCAATG CCAGCCTGAA TGAAGAAGAAGGGACACAGG GCCACCCAGA TGGGACCCCA 5760 CCAAAGCTGG ACACCGCCAA TGGCACTCCCAAAGTTTACA AGTCAGCAGA CAGCAGCACT 5820 GTGAAGAAAG GTCCTCCTGT GGCTCCCAAGCCAGCCTGGT TTCGCCAAAG CTTGAAAGGT 5880 TTGAGGAATC GTGCTTCAGA CCCAAGAGGGCTCCCTGATC CTGCCTTGTC CACCCAGCCA 5940 GCACCTGCTT CCAGGGAGCA CCTAGGATCACACATCCGGG CCTCCTCCTC CTCCTCCTCC 6000 ATCAGGCAGA GAATCAGCTC CTTTGAAACCTTTGGCTCCT CTCAACTGCC TGACAAAGGA 6060 GCCCAGAGAC TGAGCCTCCA GCCCTCCTCTGGGGAGGCAG CAAAACCTCT TGGGAAGCAT 6120 GAGGAAGGAC GGTTTTCTGG ACTCTTGGGGCGAGGGGCTG CACCCACTCT TGTGCCCCAG 6180 CAGCCTGAGC AAGTACTGTC CTCGGGGTCCCCTGCAGCCT CCGAGGCCAG AGACCCAGGC 6240 GTGTCTGAGT CCCCTCCCCC AGGGCGGCAGCCCAATCAGA AAACTCTCCC CCCTGGCCCG 6300 GACCCGCTCC TAAGGCTGCT GTCAACACAGGCTGAGGAAT CTCAAGGCCC AGTGCTCAAG 6360 ATGCCTAGCC AGCGAGCACG GAGCTTCCCCCTGACCAGGT CCCAGTCCTG TGAGACGAAG 6420 CTACTTGACG AAAAGACCAG CAAACTCTATTCTATCAGCA GCCAAGTGTC ATCGGCTGTC 6480 ATGAAATCCT TGCTGTGCCT TCCATCTTCTATCTCCTGTG CCCAGACTCC CTGCATCCCC 6540 AAGGAAGGGG CATCTCCAAC ATCATCATCCAACGAAGACT CAGCTGCAAA TGGTTCTGCT 6600 GAAACATCTG CCTTGGACAC AGGGTTCTCGCTCAAGTGAG TTTCTACACC CGGTGTTTCT 6660 CTTTACCTTT CTCATCTTTT TCTTTCTCATCTTTATTTTT AAAAATAATC 
CTATATATAA 6720 TTTAAAAAAT TCCCAGATAT ATTGATTAAAGAATTGTTCT GCCTCTTTCT TTCCATGTGT 6780 GTGCAGATGT CTGAGTGTGT GTGTGTCTGTCTGTAGGTAT TACACCTCTG CCTTTCACAT 6840 TAAGGAGGAG TTTTCACAAC ATCTGGCTTCAGGAGGGCTG GGAGGTAGGA GGTGGGACTG 6900 GCTCCCTGGT GAATTGCTCA TGAGGGCTGACATACGCCTG TGGAGATTTG GAAGGTTGAT 6960 GCACATCTGA AATGTCCTGC GGTTACTCAGAAAGACCAGA ATGAGGCCAG GAAATTATCC 7020 ATCAGGAATT CTTACTCTCC AAATGGAATCCACTTGTACT CTGCACGTGG GTTCAACTCC 7080 CTCATCAGGG AGTTAGGATG TCTGGGTCCTAGTCTCAGCT TAGGCACTGA TTCTGACTAT 7140 GAGCAGGTTC TTTCCATGCT CACCCTCAGATTCCTTGTCA GTTGAAATTA GGAGATGGAT 7200 GAGACCTTCT ATGCAGAACC AAGAGGATGTCAGACGTGCC ATAGGGTCCC TGCTGTAGGG 7260 CTGGGGCTTG GTCTTCCCTC TGATCAAAGTAGCTCTGCAT TTATTAGTTT TATTTATTAT 7320 TCTTACACTG CTGGGAAATA TCTGTAGAGTGAAGGTATGC TAGTATCTAC TCATAGATTT 7380 GTTGCATCAA ATAATATGCA CATAAGTGCTTGGCACCACA CCTGGGACAT AGTAATTATA 7440 CAATCACTGT TACCTCTTTT TAATATTGTTGTTCATACTG TGTGTTGTTT CTCCTTATGA 7500 AAGCCTTTCA GAGCTGAGAG AATATACAGAGGGTCTCACG GAAGCCAAGG AAGACGATGA 7560 TGGGGACCAC AGTTCCCTTC AGTCTGGTCAGTCCGTTATC TCCCTGCTGA GCTCAGAAGA 7620 ATTAAAAAAA CTCATCGAGG AGGTGAAGGTTCTGGATGAA GCAACATTAA AGGTAGGTTT 7680 CCTTTGTAAG CATCTGCAGT AACCAATGGCTTATTATGGC TGTGTGGCCA CCTTAGTTGG 7740 GCCAGAGGGG AAGTAGCTTG AGTAGCCTGCCACATCAGAC CCAGGTTGCG TCCTGTGATG 7800 GTGGGACACT GTAGCACTTG ACCACAGTAAGACCTTCCAT TTGAAGAGAG CCTTTTAGCT 7860 TGTGAACCAC TTTCAGTAGA TTGACTTCTTGCATCTTCTT TTGTCATTTT ATAAATGAGA 7920 AAGGTAAGGC TCAATCAAGG CTACAGAACCTGGGTTTTCT GTCTCCAAGT TCAAGTTCAG 7980 TGCTGTTTCC ACATTCCATG TGCTGCTGTCCTGGCATGTG TCTGTTGTGG GATGCTGTCC 8040 ATTGTAAACA ATGTGGGTTA CAAGAGCTCTCACCTGGAGC TTTCATTATT TCCACTGTGC 8100 ATGGAGAGGT GGCTGATCCC AGGGCTCACAAGTCCCCCAC GCTTCAGTCA AGTCATTCTG 8160 AAAGTCTCAC TTCCCATATG TTTTCTGAGCATGACCCAAA GGGGTGTGGG GAGGAAGTGG 8220 CCAGGCTGAG CTGGGGCCAG CAGTCAAATGAGCTCAGGCT CATGGGTCCT GCACCCTCTA 8280 GGTGCTGCCC CAGGCCTCCG TAGGCTTTTGGCACTAGAAT GATCCAGGCT AGGATGAAGA 8340 GGATAAGGAG GTTCTCGTTT TCCATACAAGGAGGCCTCAT AGCTGCAATT TCCACATCAA 8400 GAGTGTAGGT GAGTCTGATG 
AGCCCAAGGTGCTGCTGTGC TGAGATTCTT TCGGCTGTGG 8460 CTTTCACTTG TCACCTGGGA CCATCATCCCCCAGGATCTT ACTCAGTGCA AATGAAATAA 8520 CAGAGGCAGA GCGTGTAAAA CACAAAGAGCCATCCTGCCT GAGCTGCTCT GGGGAGAGTA 8580 TTTGCTTTCT AACATGAGAA GAGCCTTCTACAAGGCAAGT AACCTGATAC TTGGGTAAAA 8640 GTTGAGAGAG TTGGGCTAGT GTTGGGGCTTGGAGGTGAGG GTGCAGTGAG GTACATTCAT 8700 CCTTCCATGC CTTTGGGTCT TAGGGGCTCCAAGTCTTAGG ATCATAGGGA CAGCTGGAAG 8760 TCAGGTGCTC TAGTGACGCT GAGCAAGTGAATTCTTTGAC ATAAGTTTAC TCCTTAGTGC 8820 CAAGGTACAA ACAGGTGCCC CAAGAACCTGTAGGTTTACT TTATTTGGTC TGCATGGTGA 8880 TGAAAAAAAT ATTGAATTCT ATACATGATAAAACCTGAAT TGAAACCTGG ACTTTAGGGA 8940 AGTGATCTGG TAGCGCTAGT TCTGTATGCCTACGTAGAGC TGACCCTTTG AGCAGATGTA 9000 CTCGTTGGCG CTCTGCAGTT CCTATCACACATCTGCCCAT TTGGCTCATT TTAGGGACCT 9060 GCCTGACTCC TATAGGCATC TGAGTTTGAGACCCCTGCTC TAGACTGGAA TAGGAGTCTC 9120 TGACTGTGTC CTGGCTCCAT GGGAGTCCCCGTCTAGGCTA GGAAGTACCG TAGTAATGTG 9180 TGTGTGTGTG TGTGTGTGTG TGTGTGTGTCACACTTGCAC ACTGTGCATT GGGGCAGGAT 9240 GTTAGCTGGG CTTCCTTAGT GCTGCTGCTGTGACCCATGG AGAAGTAGAA GGGAAGAAGG 9300 AGCAACCAAT TCCTGCAGAA CAGCACTGACCCCTGTTTTG TTTTTTGTTG TTTTTTTTTG 9360 TTTTGTTTTA CCTGAAGTCC TACAACCTGACTTCATCTCA CACTGTCCAA TATGCTGATT 9420 TCTGGCTGAC TTCATGGCAC TCCCCCTGCCCGGCTGTGGA CAGGGTGAAT GAGAGAGGAA 9480 AATAATTATG CTTGCTGCTT TACATACATTTTTTTTTTCT TCTAAGCTTC CCATGACTCC 9540 TGAAGGTCCA TTCTTTACAG ATGAGGAAACTGAGGTTTGA GGAGGTGATG TAACTTGGAG 9600 GCTGGCCAAG CTGGGGTTTG AGATAACAAATCAGTCTGAT GTCAGTCCGA TGTTAAATTG 9660 TTCATCCTCT TGCAGTAAAA TGTTTTTGGATGTATGTATT TCCTCTGCAG CAATTAGACG 9720 GCATCCATGT CACCATCTTA CACAAGGAGGAAGGTGCTGG TCTTGGGTTC AGCTTGGCAG 9780 GAGGAGCAGA TCTAGAAAAC AAGGTGATTACGGTGAGTGG CCAAGTGAAG GGGCATGTCA 9840 CAGCCAGAGG CAATGGTTCT GGGGGAGGGGGACACACTTG CCAGGAAGGG GCCCTGTGCT 9900 GGGGAAATGA AGAATGCATG ACACTAGGCCACTGGGCAGG TCCTGTCCAC TCAGCACATC 9960 CCAGAGCCTG GGGCTGCGTG GAGAGGGTAGCAGGCCTGGC CATGGGCATC TTTTCCTGTG 10020 GGTCCCACTA TTCTGGCTCA TCCAATCTGATCAGCATTGG CTGCTGCCTT CAGGTCACCT 10080 GTACCTGACC CAGATGGTTT CTGGTTCTGCCAGTTTTGTG GAGCCATGCT 
GCGGCTGCTC 10140 GCTCTCTAAA GCCGAGTGCA TTGCTGTCATCCCAGGGCTG TGTTGTCTCA GGGTATCCTT 10200 TGTGTAGGCT GTGCTGGGCT CATTTGAATTTCCATGCCCA ACTGAAAACA AATCCTCCAG 10260 TTCACAGCAT CAGCCAGCAT TCAACATACCACACCCCCTT GCAGTGGCAA TCTGGCATGT 10320 TCCTGCGTGT CACTTCAGAG TCAATCATGTCAGTGGTGAC TTCCTTGATT TCCTGATAAG 10380 TTTTCTATCA CATAAAAAAC ACTTAAAACCGGTAAGTCTC TATTTCTCTC ACTGAGTGCA 10440 GCTGAGTATT ACAAAAAGAT TCCTGACCGTGTAGTTTACT TTCTACTTGA AGAGGAGGAA 10500 AGAGAGCTTG CCTGTGGGAA TGGCACTTTGGGTATTTTTC TCTGTCCATG AGTAGCAACT 10560 TCTGTCCACG TCATCTGGCC AGTCACCCTTGAGACACTGC AGACAACAGG AAAATAGGAG 10620 GAAGGCGCAC ATGTTGGCTG GGCACATGCACAAAAGTTCT TTCTCCTTCT GTGTTTGAGC 10680 ATTTCTCTTC CTTTCCAGAT GATTAGAAGGGAACTAACGT AGAGCACCAT CCACGGCCAT 10740 GCTGAGCACT CACTGACCTG TTGTCTAAATTCATCCTACC CACACTTTGG ATTGATAAAT 10800 TGGTGGCATT TATGCTTCTT TTATAGAGGAGGAACCTGAG GTTCCTCCTT TAATTAACTT 10860 TAATCTTCGT TGGCCTAAAA ATCCTCTAGCATTGAGGAAC CTGAGGCTCA GAGGAGAGGT 10920 TCCGCTCCAT GCTGGATGTT ACAGCCTGGGGATTCTAACC ATGTAACAGA ATTTTTCTAA 10980 CACCAAAACA ATTTAGAGAA GCAAAGAGCTTTGCTCCTAT CATAAAAGCA AAACTACAGG 11040 TACCACATTT TAGTGGTTTC CATGCATGCTATCACTCAGT CCTCTTAATT ATAATGGACC 11100 TCATTAAAGA GGCTGAGGCA GAGACATGAGATATTTTTGT GTGTTTGTTT ATCCCACATA 11160 TCTTGCAGAA AGGGGACCAA GAGGTGACTGGAGGTAAAGA GTCAGAATTT CTAGGGGAGG 11220 AGCTATAAAA ATGTCTAGAC TGCCTAGGAGAGTGTTTCTC AAAATGCATT CTGCCAAATG 11280 AGACCGAGAA TTTCTCTATG AGAAAAGAATTCTTTGTTGA AACACTTTGG GAATCCCCAT 11340 AATACCCTGC CAACTTAGAA ATCTGCCATGCAGATTTGCA TTGTAGACCC TCTAAATGCC 11400 TTTTGCTTAG AAATCTGCTG CCATGCAAATTTGCATTGTA GACCCTCTAA TGCCTTTTGC 11460 TTTTTAAAAG TAGACAATGT CAGAGCTTTTGTTTCACCCA GTATTCCTCA AGTTTCTTTG 11520 ATTTTAAAAA ATTTTTCTTG GCCGGGTGTGGTGGCTCATA CCTGTAATTC CAACACTGTG 11580 CGGGGCCAAG GTGAGAGGAT TGCTTGAGCCCAGGAGTTTG AGACCATCCT GGGCAACACT 11640 GGGAGACCTG TCTCTATTTT TTAAAAAATAATAGGAAAAG CCTTTTTCTT ACACAATACC 11700 CGTTAATATG CCATAGAATC AGTGCCTTGAGAAAACTACT CTGGGGACTT CTGACCTAGG 11760 GCAGGTGAAG CAAAAGATTT TATATGGAATCCCAACTAGA ATCGTGGTGG TACACTATAG 
11820 GACGTTGTGT TGGGATGGAT TCTGAGGGCTTACCTGGTCA TTACTGCTGG TGATCTCTGC 11880 TCTGGATGGA GAAGGAGGGA ATGCTGGCCTCTGTGCCAGC AGCTCCAATC TAGGACACAA 11940 TTATCTTTAA TCTTTGTTGG CCTAAAAATCCTCTAGCATT GACTAACCGG TTCAATCCTC 12000 CTCCAGCAAG TATGTGGACT GGACTTGTGTGATTTCTGGT CCTGACTTCC TTTGGTTTGC 12060 TCAGGTTCAC AGAGTGTTTC CAAATGGGCTGGCCTCCCAG GAAGGGACTA TTCAGAAGGG 12120 CAATGAGGTT CTTTCCATCA ACGGCAAGTCTCTCAAGGGG ACCACGCACC ATGATGCCTT 12180 GGCAATCCTC CGCCAAGCTC GAGAGCCCAGGCAAGCTGTG ATTGTCACAA GGAAGCTGAC 12240 TCCAGAGGCC ATGCCCGACC TCAACTCCTCCACTGACTCT GCAGCCTCAG CCTCTGCAGC 12300 CAGTGATGTT TCTGTAGAAT CTAGTAAGTTCTCCCAACTC AGTGGAAGCC ACATGGGCCA 12360 CATCCTCTTT GGCCATTTGG GGCCAGACCTGATGGGGCTA CTCAGTAATT TGTGACCCCA 12420 AGAATGTGTG GCTGCCTAGT ACACTGCCTGAGACGTGTTT ACATGTGCCT GTGTGCAAAC 12480 ACGGGGGCTG TATCACCCCG GGCTCACTTGAAGCCCAGGG CATCTGTGGC CTGGGGAGAG 12540 GAGAGGATCC CTAACAGAGA CCTTGTGTTTTTCTCAGCAG CAGAGGCCAC AGTCTGCACG 12600 GTGACACTGG AGAAGATGTC GGCAGGGCTGGGCTTCAGCC TGGAAGGAGG GAAGGGCTCC 12660 CTACACGGAG ACAAGCCTCT CACCATTAACAGGATTTTCA AAGGTGTGGG GTGTGTCTGG 12720 TTCTTTGCGT GCTCTCCAGT TGTGGGCATGTGGCCAGGCC CCCAAAAGGC TTCTGGGCAC 12780 TTTCTGGGCT ATGTTGTTTC CCACAACTCCATGTCCTCTT CATAGGCATG CTGGTCCTTT 12840 TAGGGCTCAA TTCTGCTTTT TCTACTTTTTCTCCTTTGCT CAGACATCCC CTCAATCCCC 12900 CCTCTGTTTT GATGGGTCTT CAAAAATACCTAAGTCCTGG GCTTGGTTCG GGTTGGCAGG 12960 GCCAGGACTC TAGAGTGGGG CAGTGAGGCACTGGCCTGTG GGGCAGAATT TTAAAGGGGT 13020 GCCAAAACAC TCAGTAACTC AGATCGATACTATTTTAATG CAGCGTGTTT TTTAAAATTA 13080 ATTTTAAAAA AACATGTTGG GACAAAATATCCAAGTTTTA AATCAAGACA GAGTCTGACT 13140 TTGTACTGCA CACTTGGCCT CATTTGCCTTACCCTAGTCC TGGACACGTC AGCTCCTGCC 13200 TTTATTTAAA ATGTTGATAG ATATTTTGTTCATCAGGGAT TGGAGTACAA ACCAGTCTGA 13260 TATGGGGGTC ACTTGGATTT CCCTGTGAAAATCATGAATG ACTGTGGCTA CCATGTAAAA 13320 CCATCCCTGA TTCTTTGGTG TTCCTCAAATTGGAGGTCTC CAAGCCACAG AGCAAGGGGT 13380 TGTAGAGAGA GGAGTACTGG ACAGGGAGGCAGGTGGGCCA GGTTCTAGTC CCAGCTCTGC 13440 CTGTAATGTG CTGAGTGACC CTTCCCTTCTGGGCCTGAGG CTCTTCATCC ATAAAAGGAG 13500 
GTAAAGAGGT ACAGGTGTGT GTCTGAGGGCTCTTAGGACT GAGACCCAAA GGGACTCTTA 13560 GCTCTGTCCC TCACCCACTA TGAGCTCCTGCTGCTGACTG GTTTCGTTAG AGGAAGTTCT 13620 GGCTGCGGCT GCAGAAACCC AGAAGGTAGAGTGAGGCTTA CATGGCATTC CCCCCAGAAT 13680 CCATGTTAAC CCCAATTCTG GGAAAGATATTTCTAATTTT TGAAGGTCAA TTTGGAAGGA 13740 GCATTGGGTT CAATGTCAAG AGGACTAGATTCCAGTGTTA GTTCTGCCAC ATGACCTTGG 13800 TAGCATGATC TTGGACAAGT CACTTCACCACCATGGGCCC ATTTGCTTAA ATGTTTAGGA 13860 TGAGACTGCC AGCTGCAGGG TGATGTTGGAAGGAGAGATG CAGATTCTGG AGCCAGAAAG 13920 TCTGGGTTCC AGCCCAGGGC CCACCACTAGCAGCTATGAA GCTCTGGGCC AGTTACTTGA 13980 GTTCTTGGTT TCCTCAGCTG TTAAAAGGAAACACAAATAA TACACCCCTC ATAGGATTAC 14040 TGTCATAAAT GCAAAACATT AGCACAACGCCTGTTAAAAT AATTGCCCAA TACACTTTAG 14100 CTATATTTTC ATTACTATCA TTAGTATTATCTTCTACTCT TATCAGGATT TGTGAAGATC 14160 AACTGTGTCA AATGGATGGG AAATTTTATTTTAATATAAA CAGTAAAATA GCATTGTTTT 14220 CACTTGCAGC TTTGAAATAG TGGGGGCCATATATGGTTGT TTCCTTTTTT ATGTGGACAC 14280 AGAGGACTTC GTGCCAGAGG CAAGATCCCTGTAAATATTG TTGCACAAAA ATCTCACTAG 14340 CTCTCTTCCC ATACCACCCA ATGCTGATGTCCTCACCACA TGCGGAGAAC AAATGTGAAG 14400 GGAGTAGGAT ATTGGGTCAG TGTCCAAAGCAGGGTCTGGG CAGGACTCAG CTCCCCAGAG 14460 TCCTCTATGA ACTATGGACG GTGCTCCAGGCAGGCTAAGG CGTGGAGCTG CCTGATATTT 14520 CCCTCCCCTG GGGACAGCAA GGGCTATCCCTTTCCAAAGG CCATGGAGAG CTGGAGCCTG 14580 GTGCCCTAAC TTTTGAGTCA CCATCTTAAGAGATGCCTCA TTTTAGAACC ACCAACAAGC 14640 AAGCTCCCAA GGGATGGTGC CCTGTTCTCTACCAAGCTAT CCTGGCTCTT TGGAGATCAA 14700 GGAGAGGAGG CAACTTTCCT TGTTCCCCATCATCTGTGGA ACCCATTACC TTCTCCCTCA 14760 TTTCAGGAGC AGCCTCAGAA CAAAGTGAGACAGTCCAGCC TGGAGATGAA ATCTTGCAGC 14820 TGGGTGGCAC TGCCATGCAG GGCCTCACACGGTTTGAAGC CTGGAACATC ATCAAGGCAC 14880 TGCCTGATGG ACCTGTCACG ATTGTCATCAGGAGAAAAAG CCTCCAGTCC AAGGAAACCA 14940 CAGCTGCTGG AGACTCCTAG GCAGGACATGCTGAAGCCAA AGCCAATAAC ACACAGCTAA 15000 CACACAGCTC CCATAACCGC TGATTCTCAGGGTCTCTGCT GCCGCCCCAC CCAGATGGGG 15060 GAAAGCACAG GTGGGCTTCC CAGTGGCTGCTGCCCAGGCC CAGACCTTCT AGGACGCCAC 15120 CCAGCAAAAG GTTGTTCCTA AAATAAGGGCAGAGTCACAC GGGGGCAGCT GATACAAATT 15180 GCAGACTGTG 
TAAAAAGAGA GCTTAATGATAATATTGTGG TGCCACAAAT AAAATGGATT 15240 TATTAGAATT TCATATGACA TTCATGCCTGGCTTCGCAAA ATGTTTCAAG TACTGTAACT 15300 GTGTCATGAT TCACCCCCAA ACAGTGACATTTATTTTTCT CATGAATCTG CAATGTGGGC 15360 AGAGATTGGA ATGGGCAGCT CATCTCTGTCCCACTTGGCA TCAGCTGGCG TCATGCAAAG 15420 TCATGCAAAG GCTGGGACCA CGTGAGATCATTCACTCATA CATCTGGCCG TTGATGTTGG 15480 CTGGGAACTC ACCTGGGGCT GCTGGCCTGAATGCTTATAG GTGGCCTCTC CTTGTGGCCT 15540 GGCCTCCTCA CAACATGGTG TCTGGATTCCCAGGATGAGC ATCCCAGGAT CGCAAGAGCC 15600 ACGTAGAAGC TGCATCTTGT TTATACCTTTGCCTTGGAAG TTGCATGGCA TCACCTCCAC 15660 CATACTCCAT CAGTTAGAGC TGACACAAACCTGCCTGGGT TTAAGGGGAG AGGAAATATT 15720 GCTGGGGTCA TTTATGAAAA ATACAGTTTGTCACATGAAA CATTTGCAAA ATTGTTTTTG 15780 GTTGGATTGG AGAAGTAATC CTAGGGAAGGGTGGTGGAGC CAGTAAACAG AGGAGTACAG 15840 GTGAAGCACC AAGCTCAAAG CGTGGACAGGTGTGCCGACA GAAGGAACCA GCGTGTATAT 15900 GAGGGTATCA AATAAAATTG CTACTACTTACCTACC 15936 28 base pairs nucleic acid single linear other nucleic acid/desc = “Primer” 2 CTTTTCGTCA AGTAGCTTCG TCTCACAG 28 47 base pairsnucleic acid single linear other nucleic acid /desc = “Primer” 3GAAATCGAAG CGGCCGCGCT CCGTGCTCGC TGGCTAGGCA TCTTGAG 47 20 base pairsnucleic acid single linear other nucleic acid /desc = “Primer” 4GGTCGACGGC CCGGGCTGGT 20 9096 base pairs nucleic acid double linear DNAexon 1..338 intron 339..663 exon 664..832 intron 833..2870 exon2871..2972 intron 2973..5224 exon 5225..5483 intron 5484..5737 exon5738..5863 intron 5864..7926 exon 7927..9096 - 356 /product= “N meansbetween 1 - about 6 bp” 5 CCGGACCCGC TCCTAAGGCT GCTGTCAACA CAGGCTGAGGAATCTCAAGG CCCAGTGCTC 60 AAGATGCCTA GCCAGCGAGC ACGGAGCTTC CCCCTGACCAGGTCCCAGTC CTGTGAGACG 120 AAGCTACTTG ACGAAAAGAC CAGCAAACTC TATTCTATCAGCAGCCAAGT GTCATCGGCT 180 GTCATGAAAT CCTTGCTGTG CCTTCCATCT TCTATCTCCTGTGCCCAGAC TCCCTGCATC 240 CCCAAGGAAG GGGCATCTCC AACATCATCA TCCAACGAAGACTCAGCTGC AAATGGTTCT 300 GCTGAAACAT CTGCCTTGGA CACAGGGTTC TCGCTCAAGTGAGTTTCTAC ACCCGNGGAT 360 GAGACCTTCT ATGCAGAACC AAGAGGATGT CAGACGTGCCATAGGGTCCC TGCTGTAGGG 420 CTGGGGCTTG 
GTCTTCCCTC TGATCAAAGT AGCTCTGCATTTATTAGTTT TATTTATTAT 480 TCTTACACTG CTGGGAAATA TCTGTAGAGT GAAGGTATGCTAGTATCTAC TCATAGATTT 540 GTTGCATCAA ATAATATGCA CATAAGTGCT TGGCACCACACCTGGGACAT AGTAATTATA 600 CAATCACTGT TACCTCTTTT TAATATTGTT GTTCATACTGTGTGTTGTTT CTCCTTATGA 660 AAGCCTTTCA GAGCTGAGAG AATATACAGA GGGTCTCACGGAAGCCAAGG AAGACGATGA 720 TGGGGACCAC AGTTCCCTTC AGTCTGGTCA GTCCGTTATCTCCCTGCTGA GCTCAGAAGA 780 ATTAAAAAAA CTCATCGAGG AGGTGAAGGT TCTGGATGAAGCAACATTAA AGGTAGGTTT 840 CCTTTGTAAG CATCTGCAGT AACCAATGGC TTATTATGGCTGTGTGGCCA CCTTAGTTGG 900 GCCAGAGGGG AAGTAGCTTG AGTAGCCTGC CACATCAGACCCAGGTTGCG TCCTGTGATG 960 GTGGGACACT GTAGCACTTG ACCACAGTAA GACCTTCCATTTGAAGAGAG CCTTTTAGCT 1020 TGTGAACCAC TTTCAGTAGA TTGACTTCTT GCATCTTCTTTTGTCATTTT ATAAATGAGA 1080 AAGGTAAGGC TCAATCAAGG CTACAGAACC TGGGTTTTCTGTCTCCAAGT TCAAGTTCAG 1140 TGCTGTTTCC ACATTCCATG TGCTGCTGTC CTGGCATGTGTCTGTTGTGG GATGCTGTCC 1200 ATTGTAAACA ATGTGGGTTA CAAGAGCTCT CACCTGGAGCTTTCATTATT TCCACTGTGC 1260 ATGGAGAGGT GGCTGATCCC AGGGCTCACA AGTCCCCCACGCTTCAGTCA AGTCATTCTG 1320 AAAGTCTCAC TTCCCATATG TTTTCTGAGC ATGACCCAAAGGGGTGTGGG GAGGAAGTGG 1380 CCAGGCTGAG CTGGGGCCAG CAGTCAAATG AGCTCAGGCTCATGGGTCCT GCACCCTCTA 1440 GGTGCTGCCC CAGGCCTCCG TAGGCTTTTG GCACTAGAATGATCCAGGCT AGGATGAAGA 1500 GGATAAGGAG GTTCTCGTTT TCCATACAAG GAGGCCTCATAGCTGCAATT TCCACATCAA 1560 GAGTGTAGGT GAGTCTGATG AGCCCAAGGT GCTGCTGTGCTGAGATTCTT TCGGCTGTGG 1620 CTTTCACTTG TCACCTGGGA CCATCATCCC CCAGGATCTTACTCAGTGCA AATGAAATAA 1680 CAGAGGCAGA GCGTGTAAAA CACAAAGAGC CATCCTGCCTGAGCTGCTCT GGGGAGAGTA 1740 TTTGCTTTCT AACATGAGAA GAGCCTTCTA CAAGGCAAGTAACCTGATAC TTGGGTAAAA 1800 GTTGAGAGAG TTGGGCTAGT GTTGGGGCTT GGAGGTGAGGGTGCAGTGAG GTACATTCAT 1860 CCTTCCATGC CTTTGGGTCT TAGGGGCTCC AAGTCTTAGGATCATAGGGA CAGCTGGAAG 1920 TCAGGTGCTC TAGTGACGCT GAGCAAGTGA ATTCTTTGACATAAGTTTAC TCCTTAGTGC 1980 CAAGGTACAA ACAGGTGCCC CAAGAACCTG TAGGTTTACTTTATTTGGTC TGCATGGTGA 2040 TGAAAAAAAT ATTGAATTCT ATACATGATA AAACCTGAATTGAAACCTGG ACTTTAGGGA 2100 AGTGATCTGG TAGCGCTAGT TCTGTATGCC TACGTAGAGCTGACCCTTTG 
AGCAGATGTA 2160 CTCGTTGGCG CTCTGCAGTT CCTATCACAC ATCTGCCCATTTGGCTCATT TTAGGGACCT 2220 GCCTGACTCC TATAGGCATC TGAGTTTGAG ACCCCTGCTCTAGACTGGAA TAGGAGTCTC 2280 TGACTGTGTC CTGGCTCCAT GGGAGTCCCC GTCTAGGCTAGGAAGTACCG TAGTAATGTG 2340 TGTGTGTGTG TGTGTGTGTG TGTGTGTGTC ACACTTGCACACTGTGCATT GGGGCAGGAT 2400 GTTAGCTGGG CTTCCTTAGT GCTGCTGCTG TGACCCATGGAGAAGTAGAA GGGAAGAAGG 2460 AGCAACCAAT TCCTGCAGAA CAGCACTGAC CCCTGTTTTGTTTTTTGTTG TTTTTTTTTG 2520 TTTTGTTTTA CCTGAAGTCC TACAACCTGA CTTCATCTCACACTGTCCAA TATGCTGATT 2580 TCTGGCTGAC TTCATGGCAC TCCCCCTGCC CGGCTGTGGACAGGGTGAAT GAGAGAGGAA 2640 AATAATTATG CTTGCTGCTT TACATACATT TTTTTTTTCTTCTAAGCTTC CCATGACTCC 2700 TGAAGGTCCA TTCTTTACAG ATGAGGAAAC TGAGGTTTGAGGAGGTGATG TAACTTGGAG 2760 GCTGGCCAAG CTGGGGTTTG AGATAACAAA TCAGTCTGATGTCAGTCCGA TGTTAAATTG 2820 TTCATCCTCT TGCAGTAAAA TGTTTTTGGA TGTATGTATTTCCTCTGCAG CAATTAGACG 2880 GCATCCATGT CACCATCTTA CACAAGGAGG AAGGTGCTGGTCTTGGGTTC AGCTTGGCAG 2940 GAGGAGCAGA TCTAGAAAAC AAGGTGATTA CGGTGAGTGGCCAAGTGAAG GGGCATGTCA 3000 CAGCCAGAGG CAATGGTTCT GGGGGAGGGG GACACACTTGCCAGGAAGGG GCCCTGTGCT 3060 GGGGAAATGA AGAATGCATG ACACTAGGCC ACTGGGCAGGTCCTGTCCAC TCAGCACATC 3120 CCAGAGCCTG GGGCTGCGTG GAGAGGGTAG CAGGCCTGGCCATGGGCATC TTTTCCTGTG 3180 GGTCCCACTA TTCTGGCTCA TCCAATCTGA TCAGCATTGGCTGCTGCCTT CAGGTCACCT 3240 GTACCTGACC CAGATGGTTT CTGGTTCTGC CAGTTTTGTGGAGCCATGCT GCGGCTGCTC 3300 GCTCTCTAAA GCCGAGTGCA TTGCTGTCAT CCCAGGGCTGTGTTGTCTCA GGGTATCCTT 3360 TGTGTAGGCT GTGCTGGGCT CATTTGAATT TCCATGCCCAACTGAAAACA AATCCTCCAG 3420 TTCACAGCAT CAGCCAGCAT TCAACATACC ACACCCCCTTGCAGTGGCAA TCTGGCATGT 3480 TCCTGCGTGT CACTTCAGAG TCAATCATGT CAGTGGTGACTTCCTTGATT TCCTGATAAG 3540 TTTTCTATCA CATAAAAAAC ACTTAAAACC GGTAAGTCTCTATTTCTCTC ACTGAGTGCA 3600 GCTGAGTATT ACAAAAAGAT TCCTGACCGT GTAGTTTACTTTCTACTTGA AGAGGAGGAA 3660 AGAGAGCTTG CCTGTGGGAA TGGCACTTTG GGTATTTTTCTCTGTCCATG AGTAGCAACT 3720 TCTGTCCACG TCATCTGGCC AGTCACCCTT GAGACACTGCAGACAACAGG AAAATAGGAG 3780 GAAGGCGCAC ATGTTGGCTG GGCACATGCA CAAAAGTTCTTTCTCCTTCT GTGTTTGAGC 3840 ATTTCTCTTC CTTTCCAGAT 
GATTAGAAGG GAACTAACGTAGAGCACCAT CCACGGCCAT 3900 GCTGAGCACT CACTGACCTG TTGTCTAAAT TCATCCTACCCACACTTTGG ATTGATAAAT 3960 TGGTGGCATT TATGCTTCTT TTATAGAGGA GGAACCTGAGGTTCCTCCTT TAATTAACTT 4020 TAATCTTCGT TGGCCTAAAA ATCCTCTAGC ATTGAGGAACCTGAGGCTCA GAGGAGAGGT 4080 TCCGCTCCAT GCTGGATGTT ACAGCCTGGG GATTCTAACCATGTAACAGA ATTTTTCTAA 4140 CACCAAAACA ATTTAGAGAA GCAAAGAGCT TTGCTCCTATCATAAAAGCA AAACTACAGG 4200 TACCACATTT TAGTGGTTTC CATGCATGCT ATCACTCAGTCCTCTTAATT ATAATGGACC 4260 TCATTAAAGA GGCTGAGGCA GAGACATGAG ATATTTTTGTGTGTTTGTTT ATCCCACATA 4320 TCTTGCAGAA AGGGGACCAA GAGGTGACTG GAGGTAAAGAGTCAGAATTT CTAGGGGAGG 4380 AGCTATAAAA ATGTCTAGAC TGCCTAGGAG AGTGTTTCTCAAAATGCATT CTGCCAAATG 4440 AGACCGAGAA TTTCTCTATG AGAAAAGAAT TCTTTGTTGAAACACTTTGG GAATCCCCAT 4500 AATACCCTGC CAACTTAGAA ATCTGCCATG CAGATTTGCATTGTAGACCC TCTAAATGCC 4560 TTTTGCTTAG AAATCTGCTG CCATGCAAAT TTGCATTGTAGACCCTCTAA TGCCTTTTGC 4620 TTTTTAAAAG TAGACAATGT CAGAGCTTTT GTTTCACCCAGTATTCCTCA AGTTTCTTTG 4680 ATTTTAAAAA ATTTTTCTTG GCCGGGTGTG GTGGCTCATACCTGTAATTC CAACACTGTG 4740 CGGGGCCAAG GTGAGAGGAT TGCTTGAGCC CAGGAGTTTGAGACCATCCT GGGCAACACT 4800 GGGAGACCTG TCTCTATTTT TTAAAAAATA ATAGGAAAAGCCTTTTTCTT ACACAATACC 4860 CGTTAATATG CCATAGAATC AGTGCCTTGA GAAAACTACTCTGGGGACTT CTGACCTAGG 4920 GCAGGTGAAG CAAAAGATTT TATATGGAAT CCCAACTAGAATCGTGGTGG TACACTATAG 4980 GACGTTGTGT TGGGATGGAT TCTGAGGGCT TACCTGGTCATTACTGCTGG TGATCTCTGC 5040 TCTGGATGGA GAAGGAGGGA ATGCTGGCCT CTGTGCCAGCAGCTCCAATC TAGGACACAA 5100 TTATCTTTAA TCTTTGTTGG CCTAAAAATC CTCTAGCATTGACTAACCGG TTCAATCCTC 5160 CTCCAGCAAG TATGTGGACT GGACTTGTGT GATTTCTGGTCCTGACTTCC TTTGGTTTGC 5220 TCAGGTTCAC AGAGTGTTTC CAAATGGGCT GGCCTCCCAGGAAGGGACTA TTCAGAAGGG 5280 CAATGAGGTT CTTTCCATCA ACGGCAAGTC TCTCAAGGGGACCACGCACC ATGATGCCTT 5340 GGCAATCCTC CGCCAAGCTC GAGAGCCCAG GCAAGCTGTGATTGTCACAA GGAAGCTGAC 5400 TCCAGAGGCC ATGCCCGACC TCAACTCCTC CACTGACTCTGCAGCCTCAG CCTCTGCAGC 5460 CAGTGATGTT TCTGTAGAAT CTAGTAAGTT CTCCCAACTCAGTGGAAGCC ACATGGGCCA 5520 CATCCTCTTT GGCCATTTGG GGCCAGACCT GATGGGGCTACTCAGTAATT 
TGTGACCCCA 5580 AGAATGTGTG GCTGCCTAGT ACACTGCCTG AGACGTGTTTACATGTGCCT GTGTGCAAAC 5640 ACGGGGGCTG TATCACCCCG GGCTCACTTG AAGCCCAGGGCATCTGTGGC CTGGGGAGAG 5700 GAGAGGATCC CTAACAGAGA CCTTGTGTTT TTCTCAGCAGCAGAGGCCAC AGTCTGCACG 5760 GTGACACTGG AGAAGATGTC GGCAGGGCTG GGCTTCAGCCTGGAAGGAGG GAAGGGCTCC 5820 CTACACGGAG ACAAGCCTCT CACCATTAAC AGGATTTTCAAAGGTGTGGG GTGTGTCTGG 5880 TTCTTTGCGT GCTCTCCAGT TGTGGGCATG TGGCCAGGCCCCCAAAAGGC TTCTGGGCAC 5940 TTTCTGGGCT ATGTTGTTTC CCACAACTCC ATGTCCTCTTCATAGGCATG CTGGTCCTTT 6000 TAGGGCTCAA TTCTGCTTTT TCTACTTTTT CTCCTTTGCTCAGACATCCC CTCAATCCCC 6060 CCTCTGTTTT GATGGGTCTT CAAAAATACC TAAGTCCTGGGCTTGGTTCG GGTTGGCAGG 6120 GCCAGGACTC TAGAGTGGGG CAGTGAGGCA CTGGCCTGTGGGGCAGAATT TTAAAGGGGT 6180 GCCAAAACAC TCAGTAACTC AGATCGATAC TATTTTAATGCAGCGTGTTT TTTAAAATTA 6240 ATTTTAAAAA AACATGTTGG GACAAAATAT CCAAGTTTTAAATCAAGACA GAGTCTGACT 6300 TTGTACTGCA CACTTGGCCT CATTTGCCTT ACCCTAGTCCTGGACACGTC AGCTCCTGCC 6360 TTTATTTAAA ATGTTGATAG ATATTTTGTT CATCAGGGATTGGAGTACAA ACCAGTCTGA 6420 TATGGGGGTC ACTTGGATTT CCCTGTGAAA ATCATGAATGACTGTGGCTA CCATGTAAAA 6480 CCATCCCTGA TTCTTTGGTG TTCCTCAAAT TGGAGGTCTCCAAGCCACAG AGCAAGGGGT 6540 TGTAGAGAGA GGAGTACTGG ACAGGGAGGC AGGTGGGCCAGGTTCTAGTC CCAGCTCTGC 6600 CTGTAATGTG CTGAGTGACC CTTCCCTTCT GGGCCTGAGGCTCTTCATCC ATAAAAGGAG 6660 GTAAAGAGGT ACAGGTGTGT GTCTGAGGGC TCTTAGGACTGAGACCCAAA GGGACTCTTA 6720 GCTCTGTCCC TCACCCACTA TGAGCTCCTG CTGCTGACTGGTTTCGTTAG AGGAAGTTCT 6780 GGCTGCGGCT GCAGAAACCC AGAAGGTAGA GTGAGGCTTACATGGCATTC CCCCCAGAAT 6840 CCATGTTAAC CCCAATTCTG GGAAAGATAT TTCTAATTTTTGAAGGTCAA TTTGGAAGGA 6900 GCATTGGGTT CAATGTCAAG AGGACTAGAT TCCAGTGTTAGTTCTGCCAC ATGACCTTGG 6960 TAGCATGATC TTGGACAAGT CACTTCACCA CCATGGGCCCATTTGCTTAA ATGTTTAGGA 7020 TGAGACTGCC AGCTGCAGGG TGATGTTGGA AGGAGAGATGCAGATTCTGG AGCCAGAAAG 7080 TCTGGGTTCC AGCCCAGGGC CCACCACTAG CAGCTATGAAGCTCTGGGCC AGTTACTTGA 7140 GTTCTTGGTT TCCTCAGCTG TTAAAAGGAA ACACAAATAATACACCCCTC ATAGGATTAC 7200 TGTCATAAAT GCAAAACATT AGCACAACGC CTGTTAAAATAATTGCCCAA TACACTTTAG 7260 CTATATTTTC ATTACTATCA 
TTAGTATTAT CTTCTACTCTTATCAGGATT TGTGAAGATC 7320 AACTGTGTCA AATGGATGGG AAATTTTATT TTAATATAAACAGTAAAATA GCATTGTTTT 7380 CACTTGCAGC TTTGAAATAG TGGGGGCCAT ATATGGTTGTTTCCTTTTTT ATGTGGACAC 7440 AGAGGACTTC GTGCCAGAGG CAAGATCCCT GTAAATATTGTTGCACAAAA ATCTCACTAG 7500 CTCTCTTCCC ATACCACCCA ATGCTGATGT CCTCACCACATGCGGAGAAC AAATGTGAAG 7560 GGAGTAGGAT ATTGGGTCAG TGTCCAAAGC AGGGTCTGGGCAGGACTCAG CTCCCCAGAG 7620 TCCTCTATGA ACTATGGACG GTGCTCCAGG CAGGCTAAGGCGTGGAGCTG CCTGATATTT 7680 CCCTCCCCTG GGGACAGCAA GGGCTATCCC TTTCCAAAGGCCATGGAGAG CTGGAGCCTG 7740 GTGCCCTAAC TTTTGAGTCA CCATCTTAAG AGATGCCTCATTTTAGAACC ACCAACAAGC 7800 AAGCTCCCAA GGGATGGTGC CCTGTTCTCT ACCAAGCTATCCTGGCTCTT TGGAGATCAA 7860 GGAGAGGAGG CAACTTTCCT TGTTCCCCAT CATCTGTGGAACCCATTACC TTCTCCCTCA 7920 TTTCAGGAGC AGCCTCAGAA CAAAGTGAGA CAGTCCAGCCTGGAGATGAA ATCTTGCAGC 7980 TGGGTGGCAC TGCCATGCAG GGCCTCACAC GGTTTGAAGCCTGGAACATC ATCAAGGCAC 8040 TGCCTGATGG ACCTGTCACG ATTGTCATCA GGAGAAAAAGCCTCCAGTCC AAGGAAACCA 8100 CAGCTGCTGG AGACTCCTAG GCAGGACATG CTGAAGCCAAAGCCAATAAC ACACAGCTAA 8160 CACACAGCTC CCATAACCGC TGATTCTCAG GGTCTCTGCTGCCGCCCCAC CCAGATGGGG 8220 GAAAGCACAG GTGGGCTTCC CAGTGGCTGC TGCCCAGGCCCAGACCTTCT AGGACGCCAC 8280 CCAGCAAAAG GTTGTTCCTA AAATAAGGGC AGAGTCACACGGGGGCAGCT GATACAAATT 8340 GCAGACTGTG TAAAAAGAGA GCTTAATGAT AATATTGTGGTGCCACAAAT AAAATGGATT 8400 TATTAGAATT TCATATGACA TTCATGCCTG GCTTCGCAAAATGTTTCAAG TACTGTAACT 8460 GTGTCATGAT TCACCCCCAA ACAGTGACAT TTATTTTTCTCATGAATCTG CAATGTGGGC 8520 AGAGATTGGA ATGGGCAGCT CATCTCTGTC CCACTTGGCATCAGCTGGCG TCATGCAAAG 8580 TCATGCAAAG GCTGGGACCA CGTGAGATCA TTCACTCATACATCTGGCCG TTGATGTTGG 8640 CTGGGAACTC ACCTGGGGCT GCTGGCCTGA ATGCTTATAGGTGGCCTCTC CTTGTGGCCT 8700 GGCCTCCTCA CAACATGGTG TCTGGATTCC CAGGATGAGCATCCCAGGAT CGCAAGAGCC 8760 ACGTAGAAGC TGCATCTTGT TTATACCTTT GCCTTGGAAGTTGCATGGCA TCACCTCCAC 8820 CATACTCCAT CAGTTAGAGC TGACACAAAC CTGCCTGGGTTTAAGGGGAG AGGAAATATT 8880 GCTGGGGTCA TTTATGAAAA ATACAGTTTG TCACATGAAACATTTGCAAA ATTGTTTTTG 8940 GTTGGATTGG AGAAGTAATC CTAGGGAAGG GTGGTGGAGCCAGTAAACAG 
AGGAGTACAG 9000 GTGAAGCACC AAGCTCAAAG CGTGGACAGG TGTGCCGACAGAAGGAACCA GCGTGTATAT 9060 GAGGGTATCA AATAAAATTG CTACTACTTA CCTACC 90963061 base pairs nucleic acid double linear cDNA CDS 190..2085 3′UTR2086..3061 5′UTR 1..189 6 CTGCTGCCAC TGCTGCTACC ACAGGAAGAC ACAGCAGGGAGAAGCCCTAG TGCCTCTGCC 60 GGCTGCCCAG GACCTGGTAT CGGCCCACAG ACCAAGTCCTCCACAGAGGG CGAGCCAGGG 120 TGGAGAAGAG CCAGCCCAGT GACCCAAACA TCCCCGATAAAACACCCACT GCTTAAGAGG 180 CAGGCTCGG ATG GAC TAT AGC TTT GAT ACC ACA GCCGAA GAC CCT TGG 228 Met Asp Tyr Ser Phe Asp Thr Thr Ala Glu Asp Pro Trp1 5 10 GTT AGG ATT TCT GAC TGC ATC AAA AAC TTA TTT AGC CCC ATC ATG AGT276 Val Arg Ile Ser Asp Cys Ile Lys Asn Leu Phe Ser Pro Ile Met Ser 1520 25 GAG AAC CAT GGC CAC ATG CCT CTA CAG CCC AAT GCC AGC CTG AAT GAA324 Glu Asn His Gly His Met Pro Leu Gln Pro Asn Ala Ser Leu Asn Glu 3035 40 45 GAA GAA GGG ACA CAG GGC CAC CCA GAT GGG ACC CCA CCA AAG CTG GAC372 Glu Glu Gly Thr Gln Gly His Pro Asp Gly Thr Pro Pro Lys Leu Asp 5055 60 ACC GCC AAT GGC ACT CCC AAA GTT TAC AAG TCA GCA GAC AGC AGC ACT420 Thr Ala Asn Gly Thr Pro Lys Val Tyr Lys Ser Ala Asp Ser Ser Thr 6570 75 GTG AAG AAA GGT CCT CCT GTG GCT CCC AAG CCA GCC TGG TTT CGC CAA468 Val Lys Lys Gly Pro Pro Val Ala Pro Lys Pro Ala Trp Phe Arg Gln 8085 90 AGC TTG AAA GGT TTG AGG AAT CGT GCT TCA GAC CCA AGA GGG CTC CCT516 Ser Leu Lys Gly Leu Arg Asn Arg Ala Ser Asp Pro Arg Gly Leu Pro 95100 105 GAT CCT GCC TTG TCC ACC CAG CCA GCA CCT GCT TCC AGG GAG CAC CTA564 Asp Pro Ala Leu Ser Thr Gln Pro Ala Pro Ala Ser Arg Glu His Leu 110115 120 125 GGA TCA CAC ATC CGG GCC TCC TCC TCC TCC TCC TCC ATC AGG CAGAGA 612 Gly Ser His Ile Arg Ala Ser Ser Ser Ser Ser Ser Ile Arg Gln Arg130 135 140 ATC AGC TCC TTT GAA ACC TTT GGC TCC TCT CAA CTG CCT GAC AAAGGA 660 Ile Ser Ser Phe Glu Thr Phe Gly Ser Ser Gln Leu Pro Asp Lys Gly145 150 155 GCC CAG AGA CTG AGC CTC CAG CCC TCC TCT GGG GAG GCA GCA AAACCT 708 Ala Gln Arg Leu Ser Leu Gln Pro Ser Ser Gly Glu Ala Ala Lys Pro160 165 170 CTT GGG AAG CAT GAG GAA GGA CGG TTT 
TCT GGA CTC TTG GGG CGAGGG 756 Leu Gly Lys His Glu Glu Gly Arg Phe Ser Gly Leu Leu Gly Arg Gly175 180 185 GCT GCA CCC ACT CTT GTG CCC CAG CAG CCT GAG CAA GTA CTG TCCTCG 804 Ala Ala Pro Thr Leu Val Pro Gln Gln Pro Glu Gln Val Leu Ser Ser190 195 200 205 GGG TCC CCT GCA GCC TCC GAG GCC AGA GAC CCA GGC GTG TCTGAG TCC 852 Gly Ser Pro Ala Ala Ser Glu Ala Arg Asp Pro Gly Val Ser GluSer 210 215 220 CCT CCC CCA GGG CGG CAG CCC AAT CAG AAA ACT CTC CCC CCTGGC CCG 900 Pro Pro Pro Gly Arg Gln Pro Asn Gln Lys Thr Leu Pro Pro GlyPro 225 230 235 GAC CCG CTC CTA AGG CTG CTG TCA ACA CAG GCT GAG GAA TCTCAA GGC 948 Asp Pro Leu Leu Arg Leu Leu Ser Thr Gln Ala Glu Glu Ser GlnGly 240 245 250 CCA GTG CTC AAG ATG CCT AGC CAG CGA GCA CGG AGC TTC CCCCTG ACC 996 Pro Val Leu Lys Met Pro Ser Gln Arg Ala Arg Ser Phe Pro LeuThr 255 260 265 AGG TCC CAG TCC TGT GAG ACG AAG CTA CTT GAC GAA AAG ACCAGC AAA 1044 Arg Ser Gln Ser Cys Glu Thr Lys Leu Leu Asp Glu Lys Thr SerLys 270 275 280 285 CTC TAT TCT ATC AGC AGC CAA GTG TCA TCG GCT GTC ATGAAA TCC TTG 1092 Leu Tyr Ser Ile Ser Ser Gln Val Ser Ser Ala Val Met LysSer Leu 290 295 300 CTG TGC CTT CCA TCT TCT ATC TCC TGT GCC CAG ACT CCCTGC ATC CCC 1140 Leu Cys Leu Pro Ser Ser Ile Ser Cys Ala Gln Thr Pro CysIle Pro 305 310 315 AAG GAA GGG GCA TCT CCA ACA TCA TCA TCC AAC GAA GACTCA GCT GCA 1188 Lys Glu Gly Ala Ser Pro Thr Ser Ser Ser Asn Glu Asp SerAla Ala 320 325 330 AAT GGT TCT GCT GAA ACA TCT GCC TTG GAC ACA GGG TTCTCG CTC AAC 1236 Asn Gly Ser Ala Glu Thr Ser Ala Leu Asp Thr Gly Phe SerLeu Asn 335 340 345 CTT TCA GAG CTG AGA GAA TAT ACA GAG GGT CTC ACG GAAGCC AAG GAA 1284 Leu Ser Glu Leu Arg Glu Tyr Thr Glu Gly Leu Thr Glu AlaLys Glu 350 355 360 365 GAC GAT GAT GGG GAC CAC AGT TCC CTT CAG TCT GGTCAG TCC GTT ATC 1332 Asp Asp Asp Gly Asp His Ser Ser Leu Gln Ser Gly GlnSer Val Ile 370 375 380 TCC CTG CTG AGC TCA GAA GAA TTA AAA AAA CTC ATCGAG GAG GTG AAG 1380 Ser Leu Leu Ser Ser Glu Glu Leu Lys Lys Leu Ile GluGlu Val Lys 385 390 395 GTT CTG GAT GAA GCA ACA TTA 
AAG CAA TTA GAC GGCATC CAT GTC ACC 1428 Val Leu Asp Glu Ala Thr Leu Lys Gln Leu Asp Gly IleHis Val Thr 400 405 410 ATC TTA CAC AAG GAG GAA GGT GCT GGT CTT GGG TTCAGC TTG GCA GGA 1476 Ile Leu His Lys Glu Glu Gly Ala Gly Leu Gly Phe SerLeu Ala Gly 415 420 425 GGA GCA GAT CTA GAA AAC AAG GTG ATT ACG GTT CACAGA GTG TTT CCA 1524 Gly Ala Asp Leu Glu Asn Lys Val Ile Thr Val His ArgVal Phe Pro 430 435 440 445 AAT GGG CTG GCC TCC CAG GAA GGG ACT ATT CAGAAG GGC AAT GAG GTT 1572 Asn Gly Leu Ala Ser Gln Glu Gly Thr Ile Gln LysGly Asn Glu Val 450 455 460 CTT TCC ATC AAC GGC AAG TCT CTC AAG GGG ACCACG CAC CAT GAT GCC 1620 Leu Ser Ile Asn Gly Lys Ser Leu Lys Gly Thr ThrHis His Asp Ala 465 470 475 TTG GCA ATC CTC CGC CAA GCT CGA GAG CCC AGGCAA GCT GTG ATT GTC 1668 Leu Ala Ile Leu Arg Gln Ala Arg Glu Pro Arg GlnAla Val Ile Val 480 485 490 ACA AGG AAG CTG ACT CCA GAG GCC ATG CCC GACCTC AAC TCC TCC ACT 1716 Thr Arg Lys Leu Thr Pro Glu Ala Met Pro Asp LeuAsn Ser Ser Thr 495 500 505 GAC TCT GCA GCC TCA GCC TCT GCA GCC AGT GATGTT TCT GTA GAA TCT 1764 Asp Ser Ala Ala Ser Ala Ser Ala Ala Ser Asp ValSer Val Glu Ser 510 515 520 525 ACA GCA GAG GCC ACA GTC TGC ACG GTG ACACTG GAG AAG ATG TCG GCA 1812 Thr Ala Glu Ala Thr Val Cys Thr Val Thr LeuGlu Lys Met Ser Ala 530 535 540 GGG CTG GGC TTC AGC CTG GAA GGA GGG AAGGGC TCC CTA CAC GGA GAC 1860 Gly Leu Gly Phe Ser Leu Glu Gly Gly Lys GlySer Leu His Gly Asp 545 550 555 AAG CCT CTC ACC ATT AAC AGG ATT TTC AAAGGA GCA GCC TCA GAA CAA 1908 Lys Pro Leu Thr Ile Asn Arg Ile Phe Lys GlyAla Ala Ser Glu Gln 560 565 570 AGT GAG ACA GTC CAG CCT GGA GAT GAA ATCTTG CAG CTG GGT GGC ACT 1956 Ser Glu Thr Val Gln Pro Gly Asp Glu Ile LeuGln Leu Gly Gly Thr 575 580 585 GCC ATG CAG GGC CTC ACA CGG TTT GAA GCCTGG AAC ATC ATC AAG GCA 2004 Ala Met Gln Gly Leu Thr Arg Phe Glu Ala TrpAsn Ile Ile Lys Ala 590 595 600 605 CTG CCT GAT GGA CCT GTC ACG ATT GTCATC AGG AGA AAA AGC CTC CAG 2052 Leu Pro Asp Gly Pro Val Thr Ile Val IleArg Arg Lys Ser Leu Gln 610 615 620 TCC AAG GAA 
ACC ACA GCT GCT GGA GACTCC TAG GCAGGACATG CTGAAGCCAA 2105 Ser Lys Glu Thr Thr Ala Ala Gly AspSer * 625 630 AGCCAATAAC ACACAGCTAA CACACAGCTC CCATAACCGC TGATTCTCAGGGTCTCTGCT 2165 GCCGCCCCAC CCAGATGGGG GAAAGCACAG GTGGGCTTCC CAGTGGCTGCTGCCCAGGCC 2225 CAGACCTTCT AGGACGCCAC CCAGCAAAAG GTTGTTCCTA AAATAAGGGCAGAGTCACAC 2285 GGGGGCAGCT GATACAAATT GCAGACTGTG TAAAAAGAGA GCTTAATGATAATATTGTGG 2345 TGCCACAAAT AAAATGGATT TATTAGAATT TCATATGACA TTCATGCCTGGCTTCGCAAA 2405 ATGTTTCAAG TACTGTAACT GTGTCATGAT TCACCCCCAA ACAGTGACATTTATTTTTCT 2465 CATGAATCTG CAATGTGGGC AGAGATTGGA ATGGGCAGCT CATCTCTGTCCCACTTGGCA 2525 TCAGCTGGCG TCATGCAAAG TCATGCAAAG GCTGGGACCA CGTGAGATCATTCACTCATA 2585 CATCTGGCCG TTGATGTTGG CTGGGAACTC ACCTGGGGCT GCTGGCCTGAATGCTTATAG 2645 GTGGCCTCTC CTTGTGGCCT GGCCTCCTCA CAACATGGTG TCTGGATTCCCAGGATGAGC 2705 ATCCCAGGAT CGCAAGAGCC ACGTAGAAGC TGCATCTTGT TTATACCTTTGCCTTGGAAG 2765 TTGCATGGCA TCACCTCCAC CATACTCCAT CAGTTAGAGC TGACACAAACCTGCCTGGGT 2825 TTAAGGGGAG AGGAAATATT GCTGGGGTCA TTTATGAAAA ATACAGTTTGTCACATGAAA 2885 CATTTGCAAA ATTGTTTTTG GTTGGATTGG AGAAGTAATC CTAGGGAAGGGTGGTGGAGC 2945 CAGTAAACAG AGGAGTACAG GTGAAGCACC AAGCTCAAAG CGTGGACAGGTGTGCCGACA 3005 GAAGGAACCA GCGTGTATAT GAGGGTATCA AATAAAATTG CTACTACTTACCTACC 3061 631 amino acids amino acid linear protein 7 Met Asp Tyr SerPhe Asp Thr Thr Ala Glu Asp Pro Trp Val Arg Ile 1 5 10 15 Ser Asp CysIle Lys Asn Leu Phe Ser Pro Ile Met Ser Glu Asn His 20 25 30 Gly His MetPro Leu Gln Pro Asn Ala Ser Leu Asn Glu Glu Glu Gly 35 40 45 Thr Gln GlyHis Pro Asp Gly Thr Pro Pro Lys Leu Asp Thr Ala Asn 50 55 60 Gly Thr ProLys Val Tyr Lys Ser Ala Asp Ser Ser Thr Val Lys Lys 65 70 75 80 Gly ProPro Val Ala Pro Lys Pro Ala Trp Phe Arg Gln Ser Leu Lys 85 90 95 Gly LeuArg Asn Arg Ala Ser Asp Pro Arg Gly Leu Pro Asp Pro Ala 100 105 110 LeuSer Thr Gln Pro Ala Pro Ala Ser Arg Glu His Leu Gly Ser His 115 120 125Ile Arg Ala Ser Ser Ser Ser Ser Ser Ile Arg Gln Arg Ile Ser Ser 130 135140 Phe Glu Thr Phe Gly Ser Ser Gln Leu Pro Asp Lys Gly Ala Gln Arg 145150 
155 160 Leu Ser Leu Gln Pro Ser Ser Gly Glu Ala Ala Lys Pro Leu GlyLys 165 170 175 His Glu Glu Gly Arg Phe Ser Gly Leu Leu Gly Arg Gly AlaAla Pro 180 185 190 Thr Leu Val Pro Gln Gln Pro Glu Gln Val Leu Ser SerGly Ser Pro 195 200 205 Ala Ala Ser Glu Ala Arg Asp Pro Gly Val Ser GluSer Pro Pro Pro 210 215 220 Gly Arg Gln Pro Asn Gln Lys Thr Leu Pro ProGly Pro Asp Pro Leu 225 230 235 240 Leu Arg Leu Leu Ser Thr Gln Ala GluGlu Ser Gln Gly Pro Val Leu 245 250 255 Lys Met Pro Ser Gln Arg Ala ArgSer Phe Pro Leu Thr Arg Ser Gln 260 265 270 Ser Cys Glu Thr Lys Leu LeuAsp Glu Lys Thr Ser Lys Leu Tyr Ser 275 280 285 Ile Ser Ser Gln Val SerSer Ala Val Met Lys Ser Leu Leu Cys Leu 290 295 300 Pro Ser Ser Ile SerCys Ala Gln Thr Pro Cys Ile Pro Lys Glu Gly 305 310 315 320 Ala Ser ProThr Ser Ser Ser Asn Glu Asp Ser Ala Ala Asn Gly Ser 325 330 335 Ala GluThr Ser Ala Leu Asp Thr Gly Phe Ser Leu Asn Leu Ser Glu 340 345 350 LeuArg Glu Tyr Thr Glu Gly Leu Thr Glu Ala Lys Glu Asp Asp Asp 355 360 365Gly Asp His Ser Ser Leu Gln Ser Gly Gln Ser Val Ile Ser Leu Leu 370 375380 Ser Ser Glu Glu Leu Lys Lys Leu Ile Glu Glu Val Lys Val Leu Asp 385390 395 400 Glu Ala Thr Leu Lys Gln Leu Asp Gly Ile His Val Thr Ile LeuHis 405 410 415 Lys Glu Glu Gly Ala Gly Leu Gly Phe Ser Leu Ala Gly GlyAla Asp 420 425 430 Leu Glu Asn Lys Val Ile Thr Val His Arg Val Phe ProAsn Gly Leu 435 440 445 Ala Ser Gln Glu Gly Thr Ile Gln Lys Gly Asn GluVal Leu Ser Ile 450 455 460 Asn Gly Lys Ser Leu Lys Gly Thr Thr His HisAsp Ala Leu Ala Ile 465 470 475 480 Leu Arg Gln Ala Arg Glu Pro Arg GlnAla Val Ile Val Thr Arg Lys 485 490 495 Leu Thr Pro Glu Ala Met Pro AspLeu Asn Ser Ser Thr Asp Ser Ala 500 505 510 Ala Ser Ala Ser Ala Ala SerAsp Val Ser Val Glu Ser Thr Ala Glu 515 520 525 Ala Thr Val Cys Thr ValThr Leu Glu Lys Met Ser Ala Gly Leu Gly 530 535 540 Phe Ser Leu Glu GlyGly Lys Gly Ser Leu His Gly Asp Lys Pro Leu 545 550 555 560 Thr Ile AsnArg Ile Phe Lys Gly Ala Ala Ser Glu Gln Ser Glu Thr 565 570 575 Val GlnPro Gly Asp Glu 
Ile Leu Gln Leu Gly Gly Thr Ala Met Gln 580 585 590 GlyLeu Thr Arg Phe Glu Ala Trp Asn Ile Ile Lys Ala Leu Pro Asp 595 600 605Gly Pro Val Thr Ile Val Ile Arg Arg Lys Ser Leu Gln Ser Lys Glu 610 615620 Thr Thr Ala Ala Gly Asp Ser 625 630
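The feature table given for SEQ ID NO: 6 (3061 base pairs; 5′UTR 1..189, CDS 190..2085, 3′UTR 2086..3061) and the length of 631 amino acids given for SEQ ID NO: 7 can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, assuming that the coordinates are 1-based and inclusive and that the CDS range includes the terminal stop codon (TAG); the function and variable names are illustrative only and are not part of the sequence listing.

```python
# Coordinates copied from the feature table of SEQ ID NO: 6.
# Assumption: ranges are 1-based and inclusive, as is usual in sequence listings.
FEATURES = {"5'UTR": (1, 189), "CDS": (190, 2085), "3'UTR": (2086, 3061)}
TOTAL_LENGTH = 3061  # base pairs stated for SEQ ID NO: 6


def span_length(start, end):
    """Length of a 1-based inclusive range."""
    return end - start + 1


def check_features(features, total_length):
    """Return basic consistency figures for the annotated features."""
    lengths = {name: span_length(s, e) for name, (s, e) in features.items()}
    codons, remainder = divmod(lengths["CDS"], 3)
    return {
        "lengths": lengths,
        # The three features should tile the whole cDNA without gaps.
        "covers_whole_sequence": sum(lengths.values()) == total_length,
        # A coding region must be a whole number of codons.
        "cds_in_frame": remainder == 0,
        # Assumption: the last codon of the CDS is the stop codon.
        "amino_acids": codons - 1,
    }


result = check_features(FEATURES, TOTAL_LENGTH)
print(result)
```

Under these assumptions the CDS spans 1896 nucleotides, that is 632 codons, of which one is the stop codon, which agrees with the 631 amino acids listed for SEQ ID NO: 7.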

What is claimed is:
 1. An isolated nucleic acid encoding a polypeptide with interleukin-16 activity, comprising a nucleic acid sequence selected from the group consisting of (1) at least nucleotide sequence 2053 to 3195 of SEQ ID NO: 1; (2) the nucleic acid sequence of SEQ ID NO: 5; (3) the nucleic acid sequence of SEQ ID NO: 6; and (4) a nucleic acid sequence which encodes the amino acid sequence of SEQ ID NO: 7.
 2. A process for producing a polypeptide with interleukin-16 activity, comprising (a) transforming or transfecting a host cell with a nucleic acid according to claim 1, to obtain a transformed or transfected host cell; (b) culturing the transformed or transfected host cell to obtain a cell culture; (c) expressing the nucleic acid in the transformed or transfected host cell to produce the polypeptide; and (d) isolating the polypeptide from the cell culture.
 3. The process of claim 2, wherein the host cell is a prokaryotic cell.
 4. The process of claim 2, wherein the host cell is a eukaryotic cell.
 5. A process for producing a polypeptide with interleukin-16 activity, comprising (a) providing a vector containing a nucleic acid according to claim 1 and regulatory elements necessary to express the nucleic acid in a eukaryotic host cell; (b) transforming or transfecting a eukaryotic cell with the vector to obtain a transformed or transfected host cell; (c) culturing the transformed or transfected host cell to obtain a cell culture; (d) expressing the nucleic acid in the transformed or transfected host cell to produce the polypeptide; and (e) isolating the polypeptide from the cell culture.
 6. A host cell which is transformed or transfected with a nucleic acid according to claim 1, wherein the host cell expresses the polypeptide with interleukin-16 activity.
 7. The host cell of claim 6, wherein the host cell is a prokaryotic cell.
 8. The host cell of claim 6, wherein the host cell is a eukaryotic cell.
 9. A vector containing a nucleic acid according to claim 1.
 10. A polypeptide with interleukin-16 activity, wherein the polypeptide can be expressed using a nucleic acid according to claim 1, wherein the polypeptide has a molecular weight of about 14 kD.
 11. A multimeric polypeptide with interleukin-16 activity, wherein the multimeric polypeptide is composed of a plurality of subunits, and wherein the subunits comprise a polypeptide according to claim 10.
 12. The multimeric polypeptide of claim 11, wherein the multimeric polypeptide is composed of two, four or eight subunits.
 13. A complementary nucleic acid of the nucleic acid of claim 1.
 14. A pharmaceutical composition, comprising a nucleic acid according to claim 1, in combination with a pharmaceutically acceptable carrier.
 15. An isolated nucleic acid which can hybridize with the nucleic acid of claim 1, at hybridization conditions selected from the group consisting of: (a) low stringency hybridization conditions characterized by a hybridizing step in 6.0×SSC at about 45° C. followed by a washing step in 2.0×SSC at 22-50° C.; and (b) high stringency hybridization conditions characterized by a hybridizing step in 6.0×SSC at about 45° C. followed by a washing step at 0.2×SSC at 50-65° C., wherein the isolated nucleic acid encodes for a polypeptide having a length of 631 amino acids similar or equal to the sequence of SEQ ID NO: 7.
 16. An isolated nucleic acid which can hybridize with the nucleic acid of claim 1, at hybridization conditions selected from the group consisting of: (a) low stringency hybridization conditions characterized by a hybridizing step in 6.0×SSC at about 45° C. followed by a washing step in 2.0×SSC at 22-50° C.; and (b) high stringency hybridization conditions characterized by a hybridizing step in 6.0×SSC at about 45° C. followed by a washing step at 0.2×SSC at 50-65° C., wherein the isolated nucleic acid consists of 7 exons and 6 introns.
 17. An isolated nucleic acid which can hybridize with the nucleic acid of claim 1, at hybridization conditions selected from the group consisting of: (a) low stringency hybridization conditions characterized by a hybridizing step in 6.0×SSC at about 45° C. followed by a washing step in 2.0×SSC at 22-50° C.; and (b) high stringency hybridization conditions characterized by a hybridizing step in 6.0×SSC at about 45° C. followed by a washing step at 0.2×SSC at 50-65° C., wherein the isolated nucleic acid contains the intron/exon junctions according to SEQ ID NO: 1.
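
The subunit counts recited for the multimeric polypeptide (two, four or eight subunits of about 14 kD each) correspond to nominal multimer masses that can be tabulated directly. The following is a minimal sketch in Python, assuming that the mass of a multimer is simply the subunit count multiplied by the approximate subunit mass; the names used are illustrative only, and real apparent masses in molecular sieve chromatography may differ.

```python
# Approximate subunit mass in kD, as recited for the monomeric polypeptide.
SUBUNIT_MASS_KD = 14
# Subunit counts recited for the multimeric polypeptide.
SUBUNIT_COUNTS = (2, 4, 8)


def nominal_multimer_masses(subunit_mass_kd, counts):
    """Map each subunit count to its nominal mass (count x subunit mass).

    Assumption: masses are additive; no correction for shape or hydration.
    """
    return {n: n * subunit_mass_kd for n in counts}


masses = nominal_multimer_masses(SUBUNIT_MASS_KD, SUBUNIT_COUNTS)
for count, mass in masses.items():
    print(f"{count} subunits: about {mass} kD")
```

On this simple additive assumption, the dimer and tetramer come out at about 28 kD and 56 kD, which is in line with the 28 kD homodimer and the 50-60 kD multimeric form mentioned in the description.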